NANBV diagnostics and vaccines.



A new virus, Hepatitis C virus (HCV), which has proven to be the major etiologic agent of blood-borne
NANBH, was discovered by Applicant. The initial work on this virus, which includes a partial genomic sequence
of the prototype HCV isolate, is described in EPO Pub. No. 318,216, and PCT Pub. No. WO 89/04669. The
present invention, which in part is based on new HCV sequences and polypeptides which are not disclosed in
the above-cited publications, includes the application of these new sequences and polypeptides in
immunoassays, probe diagnostics, anti-HCV antibody production, PCR technology, and recombinant DNA technology.
Included within the invention also are novel immunogenic polypeptides encoded within clones containing HCV
cDNA, novel methods for purifying an immunogenic HCV polypeptide, and antisense polynucleotides derived
from HCV cDNA.
Technical Field

The invention relates to materials and methodologies for managing the spread of non-A, non-B hepatitis
[...] as protective agents against the disease.
Background Art

Non-A, Non-B hepatitis (NANBH) is a transmissible disease or family of diseases that are believed to be
[...] serial passage in chimpanzees provided evidence that NANBH is due to a transmissible infectious agent or
[...]

Epidemiologic evidence suggests that there may be three types of NANBH: the water-borne
epidemic type; the blood or needle-associated type; and the sporadically occurring (community acquired)
type. However, the number of agents which may be causative of NANBH is unknown.

Clinical diagnosis and identification of NANBH has been accomplished primarily by exclusion of other
viral markers. Among the methods used to detect putative NANBV antigens and antibodies are agar-gel
[...] radioimmunoassay, and enzyme-linked immunosorbent assay. However, [...]

[...] antigens associated with HBV, as well as to the integration of HBV DNA into the genome of liver cells. In
addition, there is the possibility that NANBH is caused by more than one infectious agent, as well as the
[...] NANBH as well as in patients with other hepatic and nonhepatic diseases. Some of the reactivity detected
may represent antibody to host-determined cytoplasmic antigens.

There have been a number of candidate NANBV. See, for example, the reviews by Prince (1983),
Feinstone and Hoofnagle (1984), and Overby (1985, 1986, 1987) and the article by Iwarson (1987).
However, there is no proof that any of these candidates represent [...]

The demand for sensitive, specific methods for screening and identifying carriers of NANBV and
NANBV contaminated blood or [...]
approximately 10% of transfused patients, and NANBH accounts for up to [...]

[...] close personal contact require reliable screening, diagnostic and prognostic tools to detect nucleic acids,
antigens and antibodies related to NANBV. In addition, there is also a need for effective vaccines and
immunotherapeutic agents for the prevention and/or treatment of the disease.
Applicant discovered a new virus, the Hepatitis C virus (HCV), which has proven to be the major
etiologic agent of blood-borne NANBH (BB-NANBH). Applicant's initial work on this virus, which includes a partial genomic
sequence of the prototype HCV isolate, is described in EPO Pub. No. 318,216 and PCT Pub. No.
WO 89/04669 (published 1 June 1989). The
disclosures of these patent applications, as well as any corresponding national patent applications, are
incorporated herein by reference. These applications teach, inter alia, recombinant [...]
and expressing HCV sequences, HCV polypeptides, HCV immunodiagnostic techniques, HCV probe
diagnostic techniques, anti-HCV antibodies, and methods of isolating new HCV sequences, including
sequences of new HCV isolates.
EP 0 388 232 A1

The present invention is based, in part, on new HCV sequences and polypeptides that are not disclosed
in EPO Pub. No. 318,216, or in PCT Pub. No. WO 89/04669. Included within the invention is the application
of these new sequences and polypeptides in, inter alia, immunodiagnostics, probe diagnostics, anti-HCV
antibody production, PCR technology and recombinant DNA technology. Included within the invention, also,
are new immunoassays based upon the immunogenicity of HCV polypeptides disclosed herein. The new 
subject matter claimed herein, while developed using techniques described in, for example, EPO Pub. No. 
318,216, has a priority date which antecedes that publication, or any counterpart thereof. Thus, the invention 
provides novel compositions and methods useful for screening samples for HCV antigens and antibodies, 
and useful for treatment of HCV infections. 

Accordingly, one aspect of the invention is a recombinant polynucleotide comprising a sequence 
derived from HCV cDNA, wherein the HCV cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a, or 
clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone
205a, or clone 18g, or clone 16jh, or wherein the HCV cDNA is of a sequence indicated by nucleotide 
numbers -319 to 1348 or 8659 to 8866 in Fig. 17. 

Another aspect of the invention is a purified polypeptide comprising an epitope encoded within HCV 
cDNA wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 
8866 in Fig. 17. 

Yet another aspect of the invention is an immunogenic polypeptide produced by a cell transformed with 
a recombinant expression vector comprising an ORF of DNA derived from HCV cDNA, wherein the HCV 
cDNA is comprised of a sequence derived from the HCV cDNA sequence in clone CA279a, or clone CA74a, 
or clone 13i, or clone CA290a, or clone 33c, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone
8f, or clone 33f, or clone 33g, or clone 39c, or clone 15e, and wherein the ORF is operably linked to a
control sequence compatible with a desired host. 

Another aspect of the invention is a peptide comprising an HCV epitope, wherein the peptide is of the 
formula

AAx-AAy

wherein x and y designate amino acid numbers shown in Fig. 17, and wherein the peptide is selected from
the group consisting of AA1-AA25, AA1-AA50, AA1-AA84, AA9-AA177, AA1-AA10, AA5-AA20, AA20-AA25,
AA35-AA45, AA50-AA100, AA40-AA90, AA45-AA65, AA65-AA75, AA80-AA90, AA99-AA120, AA95-AA110,
AA105-AA120, AA100-AA150, AA150-AA200, AA155-AA170, AA190-AA210, AA200-AA250, AA220-AA240,
AA245-AA265, AA250-AA300, AA290-AA330, AA290-AA305, AA300-AA350, AA310-AA330, AA350-AA400,
AA380-AA395, AA405-AA495, AA400-AA450, AA405-AA415, AA415-AA425, AA425-AA435, AA437-AA582,
AA450-AA500, AA440-AA460, AA460-AA470, AA475-AA495, AA500-AA550, AA511-AA690, AA515-AA550,
AA550-AA600, AA550-AA625, AA575-AA605, AA585-AA600, AA600-AA650, AA600-AA625, AA635-AA665,
AA650-AA700, AA645-AA680, AA700-AA750, AA700-AA725, AA700-AA750, AA725-AA775, AA770-AA790,
AA750-AA800, AA800-AA815, AA825-AA850, AA850-AA875, AA800-AA850, AA920-AA990, AA850-AA900,
AA920-AA945, AA940-AA965, AA970-AA990, AA950-AA1000, AA1000-AA1060, AA1000-AA1025, AA1000-AA1050,
AA1025-AA1040, AA1040-AA1055, AA1075-AA1175, AA1050-AA1200, AA1070-AA1100, AA1100-AA1130,
AA1140-AA1165, AA1192-AA1457, AA1195-AA1250, AA1200-AA1225, AA1225-AA1250, AA1250-AA1300,
AA1260-AA1310, AA1260-AA1280, AA1266-AA1428, AA1300-AA1350, AA1290-AA1310, AA1310-AA1340,
AA1345-AA1405, AA1345-AA1365, AA1350-AA1400, AA1365-AA1380, AA1380-AA1405, AA1400-AA1450,
AA1450-AA1500, AA1460-AA1475, AA1475-AA1515, AA1475-AA1500, AA1500-AA1550, AA1500-AA1515,
AA1515-AA1550, AA1550-AA1600, AA1545-AA1560, AA1569-AA1931, AA1570-AA1590, AA1595-AA1610,
AA1590-AA1650, AA1610-AA1645, AA1650-AA1690, AA1685-AA1770, AA1689-AA1805, AA1690-AA1720,
AA1694-AA1735, AA1720-AA1745, AA1745-AA1770, AA1750-AA1800, AA1775-AA1810, AA1795-AA1850,
AA1850-AA1900, AA1900-AA1950, AA1900-AA1920, AA1916-AA2021, AA1920-AA1940, AA1949-AA2124,
AA1950-AA2000, AA1950-AA1985, AA1980-AA2000, AA2000-AA2050, AA2005-AA2025, AA2020-AA2045,
AA2045-AA2100, AA2045-AA2070, AA2054-AA2223, AA2070-AA2100, AA2100-AA2150, AA2150-AA2200,
AA2200-AA2250, AA2200-AA2325, AA2250-AA2330, AA2255-AA2270, AA2265-AA2280, AA2280-AA2290,
AA2287-AA2385, AA2300-AA2350, AA2290-AA2310, AA2310-AA2330, AA2330-AA2350, AA2350-AA2400,
AA2348-AA2464, AA2345-AA2415, AA2345-AA2375, AA2370-AA2410, AA2371-AA2502, AA2400-AA2450,
AA2400-AA2425, AA2415-AA2450, AA2445-AA2500, AA2445-AA2475, AA2470-AA2490, AA2500-AA2550,
AA2505-AA2540, AA2535-AA2560, AA2550-AA2600, AA2560-AA2580, AA2600-AA2650, AA2605-AA2620,
AA2620-AA2650, AA2640-AA2660, AA2650-AA2700, AA2655-AA2670, AA2670-AA2700, AA2700-AA2750,
AA2740-AA2760, AA2750-AA2800, AA2755-AA2780, AA2780-AA2830, AA2785-AA2810, AA2796-AA2886,
AA2810-AA2825, AA2800-AA2850, AA2850-AA2900, AA2850-AA2865, AA2885-AA2905, AA2900-AA2950,
AA2910-AA2930, AA2925-AA2950, AA2945-end (C' terminal).
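
The peptide ranges listed above can be read as 1-based, inclusive spans over the putative polyprotein of Fig. 17. The following is a minimal Python sketch of that reading, not part of the disclosure. It assumes the polyprotein is available as a one-letter amino acid string; the helper name `peptide_from_range` is hypothetical, and the short `polyprotein` value is an arbitrary placeholder rather than the HCV sequence.

```python
import re

def peptide_from_range(polyprotein: str, spec: str) -> str:
    """Return the peptide named by a spec such as 'AA5-AA20' or 'AA2945-end'.

    Residue numbers are treated as 1-based and inclusive, which is an
    assumption about how the amino acids in Fig. 17 are numbered.
    """
    match = re.fullmatch(r"AA(\d+)-(?:AA(\d+)|end)", spec.strip())
    if match is None:
        raise ValueError(f"unrecognized range: {spec!r}")
    start = int(match.group(1))
    # An 'end' spec runs to the C terminus of the supplied sequence.
    end = int(match.group(2)) if match.group(2) else len(polyprotein)
    if not 1 <= start <= end <= len(polyprotein):
        raise ValueError(f"range {spec!r} falls outside the sequence")
    return polyprotein[start - 1:end]

# Arbitrary placeholder sequence for illustration only.
polyprotein = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKLMNPQRSTVWY"
print(peptide_from_range(polyprotein, "AA1-AA10"))
print(peptide_from_range(polyprotein, "AA35-end"))
```

Ranges that extend past the supplied sequence raise an error rather than being silently truncated, so a mistyped residue number is caught early.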

Still another aspect of the invention is a monoclonal antibody directed against an epitope encoded in
HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or
8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or clone 59a, or clone 84a, or
clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone
205a, or clone 18g, or clone 16jh.

Yet another aspect of the invention is a preparation of purified polyclonal antibodies directed against a
polypeptide comprised of an epitope encoded within HCV cDNA, wherein the HCV cDNA is of a sequence
indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in
clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone
CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh.

Another aspect of the invention is a polynucleotide probe for HCV, wherein the probe is comprised
of an HCV sequence derived from an HCV cDNA sequence indicated by nucleotide numbers -319 to 1348
or 8659 to 8866 in Fig. 17, or from the complement of the HCV cDNA sequence.

Yet another aspect of the invention is a kit for analyzing samples for the presence of polynucleotides
from HCV comprising a polynucleotide probe containing a nucleotide sequence of about 8 or more
nucleotides, wherein the nucleotide sequence is derived from HCV cDNA which is of a sequence indicated
by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, wherein the polynucleotide probe is in a
suitable container.
Still another aspect of the invention is a kit for analyzing samples for the presence of an HCV antigen
comprising an antibody which reacts immunologically with an HCV antigen, wherein the antigen contains an
epitope encoded within HCV cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348 or
8659 to 8866 in Fig. 17, or wherein the HCV cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a,
or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or
[...]

Another aspect of the invention is a kit for analyzing samples for the presence of an HCV antibody
[...]
clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or
clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh.
Another aspect of the invention is a kit for analyzing samples for the presence of an HCV antibody
comprising an antigenic polypeptide expressed from HCV cDNA in clone CA279a, or clone CA74a, or clone
13i, or clone CA290a, or clone 33c, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or
clone 33f, or clone 33g, or clone 39c, or clone 15e, wherein the antigenic polypeptide is present in a
suitable container.

Another aspect of the invention is a method for detecting HCV nucleic acids in a sample comprising:
(a) reacting nucleic acids of the sample with a polynucleotide probe for HCV, wherein the probe is
comprised of an HCV sequence derived from an HCV cDNA sequence which is of a sequence indicated by
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, and wherein the reacting is under conditions
which allow the formation of a polynucleotide duplex between the probe and the HCV nucleic acid from the
sample; and (b) detecting a polynucleotide duplex which contains the probe, formed in step (a).

Yet another aspect of the invention is an immunoassay for detecting an HCV antigen comprising:
(a) incubating a sample suspected of containing an HCV antigen with an antibody directed against an HCV
epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide numbers
-319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or clone 59a,
or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone
ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the incubating is under conditions which
allow formation of an antigen-antibody complex; and (b) detecting an antibody-antigen complex formed in
step (a).

Another aspect of the invention is an immunoassay for detecting antibodies directed against an
HCV antigen comprising:
(a) incubating a sample suspected of containing anti-HCV antibodies with an antigen polypeptide containing
an epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or
clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone
CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the incubating is under
conditions which allow formation of an antigen-antibody complex; and (b) detecting an antibody-antigen
complex formed in step (a) which contains the antigen polypeptide.

Another aspect of the invention is a vaccine for treatment of HCV infection comprising an immunogenic
polypeptide containing an HCV epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence
indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in
clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone
CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the
immunogenic polypeptide is present in a pharmacologically effective dose in a pharmaceutically acceptable
excipient.

Yet another aspect of the invention is a method for producing antibodies to HCV comprising
administering to an individual an isolated immunogenic polypeptide containing an HCV epitope encoded in HCV cDNA,
wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in
Fig. 17, or is of the sequence present in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or
clone 33c, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or
clone 39c, or clone 15e, and wherein the immunogenic polypeptide is present in a pharmacologically
effective dose in a pharmaceutically acceptable excipient.

Still another aspect of the invention is an antisense polynucleotide derived from HCV cDNA, wherein the 
HCV cDNA is that shown in Fig. 17. 

Yet another aspect of the invention is a method for preparing purified fusion polypeptide C100-3
comprising:

(a) providing a crude cell lysate containing polypeptide C100-3,

(b) treating the crude cell lysate with an amount of acetone which causes the polypeptide to
precipitate,

(c) isolating and solubilizing the precipitated material,

(d) isolating the C100-3 polypeptide by anion exchange chromatography, and

(e) further isolating the C100-3 polypeptide of step (d) by gel filtration.



Brief Description of the Drawings

Fig. 1 shows the sequence of the HCV cDNA in clone 12f, and the amino acids encoded therein.

Fig. 2 shows the HCV cDNA sequence in clone k9-1, and the amino acids encoded therein.

Fig. 3 shows the sequence of clone 15e, and the amino acids encoded therein.

Fig. 4 shows the nucleotide sequence of HCV cDNA in clone 13i, the amino acids encoded therein,
and the sequences which overlap with clone 12f.

Fig. 5 shows the nucleotide sequence of HCV cDNA in clone 26j, the amino acids encoded therein,
and the sequences which overlap clone 13i.

Fig. 6 shows the nucleotide sequence of HCV cDNA in clone CA59a, the amino acids encoded
therein, and the sequences which overlap with clones 26j and K9-1.

Fig. 7 shows the nucleotide sequence of HCV cDNA in clone CA84a, the amino acids encoded
therein, and the sequences which overlap with clone CA59a.

Fig. 8 shows the nucleotide sequence of HCV cDNA in clone CA156e, the amino acids encoded
therein, and the sequences which overlap with CA84a.

Fig. 9 shows the nucleotide sequence of HCV cDNA in clone CA167b, the amino acids encoded
therein, and the sequences which overlap CA156e.

Fig. 10 shows the nucleotide sequence of HCV cDNA in clone CA216a, the amino acids encoded
therein, and the overlap with clone CA167b.

Fig. 11 shows the nucleotide sequence of HCV cDNA in clone CA290a, the amino acids encoded
therein, and the overlap with clone CA216a.

Fig. 12 shows the nucleotide sequence of HCV cDNA in clone ag30a and the overlap with clone
CA290a.

Fig. 13 shows the nucleotide sequence of HCV cDNA in clone CA205a, and the overlap with the HCV
cDNA sequence in clone CA290a.

Fig. 14 shows the nucleotide sequence of HCV cDNA in clone 18g, and the overlap with the HCV
cDNA sequence in clone ag30a.

Fig. 15 shows the nucleotide sequence of HCV cDNA in clone 16jh, the amino acids encoded therein,
and the overlap of nucleotides with the HCV cDNA sequence in clone 15e.

Fig. 16 shows the ORF of HCV cDNA derived from clones pi14a, CA167b, CA156e, CA84a, CA59a,
K9-1, 12f, 14i, 11b, 7f, 7e, 8h, 33c, 40b, 37b, 35, 36, 81, 32, 33b, 25c, 14c, 8f, 33f, 33g, 39c, 35f, 19g, 26g,
and 15e.

Fig. 17 shows the sense strand of the compiled HCV cDNA sequence derived from the above-described
clones and the compiled HCV cDNA sequence published in EPO Pub. No. 318,216. The clones
from which the sequence was derived are b114a, 18g, ag30a, CA205a, CA290a, CA216a, pi14a, CA167b,
CA156e, CA84a, CA59a, K9-1 (also called k9-1), 26j, 13i, 12f, 14i, 11b, 7f, 7e, 8h, 33c, 40b, 37b, 35, 36, 81,
32, 33b, 25c, 14c, 8f, 33f, 33g, 39c, 35f, 19g, 26g, 15e, b5a, and 16jh. In the figure the three horizontal
dashes above the sequence indicate the position of the putative initiator methionine codon; the two vertical
dashes indicate the first and last nucleotides of the published sequence. Also shown in the figure is the
amino acid sequence of the putative polyprotein encoded in the HCV cDNA.

Fig. 18 is a diagram of the immunological colony screening method used in antigenic mapping
studies.

Fig. 19 shows the hydrophobicity profiles of polyproteins encoded in HCV and in West Nile virus.

Fig. 20 is a tracing of the hydrophilicity/hydrophobicity profile and of the antigenic index of the
putative HCV polyprotein.

Fig. 21 shows the conserved co-linear peptides in HCV and Flaviviruses.

Modes for Carrying Out the Invention 

I. Definitions 

The term "hepatitis C virus" has been reserved by workers in the field for a heretofore unknown
etiologic agent of NANBH. Accordingly, as used herein, "hepatitis C virus" (HCV) refers to an agent
causative of NANBH, which was formerly referred to as NANBV and/or BB-NANBV. The terms HCV,
NANBV, and BB-NANBV are used interchangeably herein. As an extension of this terminology, the disease
caused by HCV, formerly called NANB hepatitis (NANBH), is called hepatitis C. The terms NANBH and
hepatitis C may be used interchangeably herein.

The term "HCV", as used herein, denotes a viral species of which pathogenic strains cause NANBH,
and attenuated strains or defective interfering particles derived therefrom. As shown infra, the HCV genome
is comprised of RNA. It is known that RNA-containing viruses have relatively high rates of spontaneous
mutation, i.e., reportedly on the order of 10^-3 to 10^-4 per incorporated nucleotide (Fields & Knipe (1986)).
Therefore, there are multiple strains, which may be virulent or avirulent, within the HCV species described
infra. The compositions and methods described herein enable the propagation, identification, detection, and
isolation of the various HCV strains or isolates. Moreover, the disclosure herein allows the preparation of
diagnostics and vaccines for the various strains, as well as compositions and methods that have utility in
screening procedures for anti-viral agents for pharmacologic use, such as agents that inhibit replication of
HCV.

The information provided herein, although derived from the prototype strain or isolate of HCV,
hereinafter referred to as CDC/HCV1 (also called HCV1), is sufficient to allow a viral taxonomist to identify
other strains which fall within the species. The information provided herein allows the belief that HCV is a
Flavi-like virus. The morphology and composition of Flavivirus particles are known, and are discussed in
Brinton (1986). Generally, with respect to morphology, Flaviviruses contain a central nucleocapsid
surrounded by a lipid bilayer. Virions are spherical and have a diameter of about 40-50 nm. Their cores are
about 25-30 nm in diameter. Along the outer surface of the virion envelope are projections that are about
5-10 nm long with terminal knobs about 2 nm in diameter.

Different strains or isolates of HCV are expected to contain variations at the amino acid and nucleic
acid levels compared with the prototype isolate, HCV1. Many isolates are expected to show much (i.e., more than
about 40%) homology in the total amino acid sequence compared with HCV1. However, it may also be
found that there are other less homologous HCV isolates. These would be defined as HCV strains according to
various criteria such as an ORF of approximately 9,000 nucleotides to approximately 12,000 nucleotides,
encoding a polyprotein similar in size to that of HCV1, an encoded polyprotein of similar hydrophobic and
antigenic character to that of HCV1, and the presence of co-linear peptide sequences that are conserved
with HCV1. In addition, the genome would be a positive-stranded RNA.

HCV encodes at least one epitope which is immunologically identifiable with an epitope in the HCV
genome from which the cDNAs described herein are derived; preferably the epitope is contained in an amino
acid sequence described herein. The epitope is unique to HCV when compared to other known Flaviviruses.
The uniqueness of the epitope may be determined by its immunological reactivity with anti-HCV antibodies
and lack of immunological reactivity with antibodies to other Flavivirus species. Methods for determining
immunological reactivity are known in the art, for example, by radioimmunoassay, by ELISA, and by
hemagglutination, and several examples of suitable techniques for assays are provided herein.

In addition to the above, the following parameters of nucleic acid homology and amino acid homology
are applicable, either alone or in combination, in identifying a strain or isolate as HCV. Since HCV strains
and isolates are evolutionarily related, it is expected that the overall homology of the genomes at the
nucleotide level probably will be about 40% or greater, probably about 60% or greater, and even more
probably about 80% or greater; and in addition that there will be corresponding contiguous sequences of at
least about 13 nucleotides. The correspondence between the putative HCV strain genomic sequence and
the CDC/HCV1 cDNA sequence can be determined by techniques known in the art. For example, it can
be determined by a direct comparison of the sequence information of the polynucleotide from the putative
HCV, and the HCV cDNA sequence(s) described herein. It can also be determined by
hybridization of the polynucleotides under conditions which form stable duplexes between homologous
regions (for example, those which would be used prior to S1 digestion), followed by digestion with single
stranded specific nuclease(s), followed by size determination of the digested fragments.
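
The direct-comparison route described above can be sketched in Python. This is an illustrative sketch only, not a method from the disclosure: it assumes the two sequences have already been aligned to equal length, it uses the 40% and 13-nucleotide figures from the text as default thresholds, and the function names and the two short sequences are made up for the example. A real comparison would normally rely on a proper alignment tool.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)

def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous stretch that occurs in both sequences."""
    a, b = a.upper(), b.upper()
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        # curr[j] is the length of the common run ending at ch_a and b[j - 1].
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best

def meets_nucleotide_criteria(a: str, b: str,
                              min_identity: float = 40.0,
                              min_run: int = 13) -> bool:
    """Apply both thresholds: overall identity and a contiguous shared stretch."""
    return (percent_identity(a, b) >= min_identity
            and longest_shared_run(a, b) >= min_run)

# Made-up sequences for illustration; they are not HCV sequences.
reference = "AAAA" + "CGTACGTAGCTAGC" + "TTTT"
candidate = "GGGG" + "CGTACGTAGCTAGC" + "GGGG"
print(percent_identity(reference, candidate))
print(longest_shared_run(reference, candidate))
print(meets_nucleotide_criteria(reference, candidate))
```

Exact identity is used here as a simple stand-in for homology; the hybridization and nuclease-digestion route has no direct computational counterpart in this sketch.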

Because of the evolutionary relationship of the strains or isolates of HCV, putative HCV strains or
isolates are identifiable by their homology at the polypeptide level. Generally, HCV strains or isolates are
expected to be more than about 40% homologous, probably more than about 70% homologous, and even
more probably more than about 80% homologous, and some may even be more than about 90%
homologous at the polypeptide level. The techniques for determining amino acid sequence homology are
known in the art. For example, the amino acid sequence may be determined directly and compared to the
sequences provided herein. Alternatively the nucleotide sequence of the genomic material of the putative
HCV may be determined (usually via a cDNA intermediate), the amino acid sequence encoded therein can
be determined, and the corresponding regions compared.
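
The alternative route above (determine the nucleotide sequence, derive the encoded amino acids, then compare) can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the procedure of the disclosure: it uses the standard genetic code, reads the sense strand in frame 1 only, and expects unambiguous bases (any other character raises `KeyError`). The function names and example sequences are made up for the illustration.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(cdna: str) -> str:
    """Translate a sense-strand sequence in frame 1, stopping at the first stop codon."""
    seq = cdna.upper().replace("U", "T")  # accept RNA as well as cDNA
    protein = []
    for pos in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

def percent_identity(a: str, b: str) -> float:
    """Percent of identical residues in two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

# Made-up sequences that differ at one synonymous position (GCC vs GCG).
reference_cdna = "ATGGCCTGGTAAGGG"
candidate_cdna = "ATGGCGTGG"
reference_protein = translate(reference_cdna)
candidate_protein = translate(candidate_cdna)
print(reference_protein, candidate_protein)
print(percent_identity(reference_protein, candidate_protein))
```

Because the genetic code is degenerate, two isolates can differ at the nucleotide level while encoding the same residues, which is one reason polypeptide-level and nucleotide-level homology are treated as separate criteria.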

As used herein, a polynucleotide "derived from" a designated sequence refers to a polynucleotide
sequence which is comprised of a sequence of at least about 6 nucleotides, preferably at
least about 8 nucleotides, more preferably at least about 10-12 nucleotides, and even more preferably at
least about 15-20 nucleotides corresponding to a region of the designated nucleotide sequence.
"Corresponding" means homologous to or complementary to the designated sequence. Preferably, the
sequence of the region from which the polynucleotide is derived is homologous to or complementary to a
sequence which is unique to an HCV genome. Whether or not a sequence is unique to the HCV genome
can be determined by techniques known to those of skill in the art. For example, the sequence can be
compared to sequences in databanks, e.g., GenBank, to determine whether it is present in the uninfected
host or other organisms. The sequence can also be compared to the known sequences of other viral
agents, including those which are known to induce hepatitis, e.g., HAV, HBV, and HDV, and to other
members of the Flaviviridae. The correspondence or non-correspondence of the derived sequence to other
sequences can also be determined by hybridization under the appropriate stringency conditions.
Hybridization techniques for determining the complementarity of nucleic acid sequences are known in the art, and
are discussed infra. See also, for example, Maniatis et al. (1982). In addition, mismatches of duplex
polynucleotides formed by hybridization can be determined by known techniques, including for example,
digestion with a nuclease such as S1 that specifically digests single-stranded areas in duplex
polynucleotides. Regions from which typical DNA sequences may be "derived" include but are not limited to,
for example, regions encoding specific epitopes, as well as non-transcribed and/or non-translated regions.

The derived polynucleotide is not necessarily physically derived from the nucleotide sequence shown,
but may be generated in any manner, including for example, chemical synthesis or DNA replication or
reverse transcription or transcription. In addition, combinations of regions corresponding to that of the
designated sequence may be modified in ways known in the art to be consistent with an intended use.
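
The length thresholds in the definition above lend themselves to a simple computational check. The Python sketch below is a simplified illustration, not the test used in the disclosure: it treats "corresponding" as an exact match of at least `min_len` nucleotides to either the designated sequence or its reverse complement, which is narrower than homology in general. The function names, the default of 8 nucleotides (one of the thresholds named in the text), and the example sequences are assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def corresponds(candidate: str, designated: str, min_len: int = 8) -> bool:
    """True if a stretch of at least min_len nucleotides of the candidate occurs
    in the designated sequence or in its reverse complement.

    Exact matching is a simplifying assumption standing in for
    'homologous to or complementary to'.
    """
    candidate = candidate.upper()
    targets = (designated.upper(), reverse_complement(designated))
    for start in range(len(candidate) - min_len + 1):
        window = candidate[start:start + min_len]
        if any(window in target for target in targets):
            return True
    return False

# Made-up sequences for illustration; they are not HCV sequences.
designated = "ATGCGTACGTTAGC"
print(corresponds("CCGCGTACGTCC", designated))     # shares a sense-strand stretch
print(corresponds("TTGCTAACGTACTT", designated))   # shares a complementary stretch
print(corresponds("TTTTTTTTTT", designated))       # shares neither
```

Whether such a stretch is also unique to HCV is a separate question, which the text addresses through databank comparison and hybridization rather than through a length check alone.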

Similarly, a polypeptide or amino acid sequence "derived from" a designated nucleic acid sequence
refers to a polypeptide having an amino acid sequence identical to that of a polypeptide encoded in the
sequence, or a portion thereof wherein the portion consists of at least 3-5 amino acids, and more preferably
at least 8-10 amino acids, and even more preferably at least 11-15 amino acids, or which is
immunologically identifiable with a polypeptide encoded in the sequence.

A recombinant or derived polypeptide is not necessarily translated from a designated nucleic acid
sequence, for example, the HCV cDNA sequences described herein, or from an HCV genome; it may be
generated in any manner, including for example, chemical synthesis, or expression of a recombinant
expression system, or isolation from mutated HCV. A recombinant or derived polypeptide may include one
or more analogs of amino acids or unnatural amino acids in its sequence. Methods of inserting analogs of
amino acids into a sequence are known in the art. It also may include one or more labels, which are known
to those of skill in the art.

The term "recombinant polynucleotide" as used herein intends a polynucleotide of genomic, cDNA, 
semisynthetic, or synthetic origin which, by virtue of its origin or manipulation: (1) is not associated
with all or a portion of a polynucleotide with which it is associated in nature, (2) is linked to a polynucleotide
other than that to which it is linked in nature, or (3) does not occur in nature.

The term "polynucleotide" as used herein refers to a polymeric form of nucleotides of any length, either 
ribonucleotides or deoxyribonucleotides. This term refers only to the primary structure of the molecule. 
Thus, this term includes double- and single-stranded DNA, as well as double- and single stranded RNA. It 
also includes known types of modifications, for example, labels which are known in the art, methylation,
"caps", substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide
modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates,
phosphotriesters, phosphoamidates, carbamates, etc.) and with charged linkages (e.g., phosphorothioates,
phosphorodithioates, etc.), those containing pendant moieties, such as, for example, proteins (including,
e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), those with intercalators (e.g.,
acridine, psoralen, etc.), those containing chelators (e.g., metals, radioactive metals, boron, oxidative metals,
etc.), those containing alkylators, those with modified linkages (e.g., alpha anomeric nucleic acids, etc.), as
well as unmodified forms of the polynucleotide.

The term "purified viral polynucleotide" refers to an HCV genome or fragment thereof which is 
essentially free, i.e., contains less than about 50%, preferably less than about 70%, and even more 
preferably less than about 90% of polypeptides with which the viral polynucleotide is naturally associated. 
Techniques for purifying viral polynucleotides from viral particles are known in the art, and include for 
example, disruption of the particle with a chaotropic agent, differential extraction and separation of the 
polynucleotide(s) and polypeptides by ion-exchange chromatography, affinity chromatography, and sedi- 
mentation according to density. 

The term "purified viral polypeptide" refers to an HCV polypeptide or fragment thereof which is 
essentially free, i.e., contains less than about 50%, preferably less than about 70%, and even more 
preferably less than about 90%, of cellular components with which the viral polypeptide is naturally 
associated. Techniques for purifying viral polypeptides are known in the art, and examples of these
techniques are discussed infra. The term "purified viral polynucleotide" refers to an HCV genome or 
fragment thereof which is essentially free, i.e., contains less than about 20%, preferably less than about 
50%, and even more preferably less than about 70% of polypeptides with which the viral polynucleotide is 
naturally associated. Techniques for purifying viral polynucleotides from viral particles are known in the art,
and include for example, disruption of the particle with a chaotropic agent, and separation of the 
polynucleotide(s) and polypeptides by ion-exchange chromatography, affinity chromatography, and sedi- 
mentation according to density. 

"Recombinant host cells", "host cells", "cells", "cell lines", "cell cultures", and other such terms 
denoting microorganisms or higher eukaryotic cell lines cultured as unicellular entities refer to cells which
can be, or have been, used as recipients for recombinant vector or other transfer DNA, and include the 
progeny of the original cell which has been transfected. It is understood that the progeny of a single 
parental cell may not necessarily be completely identical in morphology or in genomic or total DNA 
complement as the original parent, due to natural, accidental, or deliberate mutation. 

A "replicon" is any genetic element, e.g., a plasmid, a chromosome, a virus, a cosmid, etc. that 
behaves as an autonomous unit of polynucleotide replication within a cell; i.e., capable of replication under 
its own control.

A "vector" is a replicon in which another polynucleotide segment is attached, so as to bring about the 
replication and/or expression of the attached segment. 

"Control sequence" refers to polynucleotide sequences which are necessary to effect the expression of 
coding sequences to which they are ligated. The nature of such control sequences differs depending upon 
the host organism; in prokaryotes, such control sequences generally include promoter, ribosomal binding 
site, and terminators; in eukaryotes, generally, such control sequences include promoters, terminators and, 
in some instances, enhancers. The term "control sequences" is intended to include, at a minimum, all 
components whose presence is necessary for expression, and may also include additional components 
whose presence is advantageous, for example, leader sequences. 

"Operably linked" refers to a juxtaposition wherein the components so described are in a relationship 
permitting them to function in their intended manner. A control sequence "operably linked" to a coding 
sequence is ligated in such a way that expression of the coding sequence is achieved under conditions 
compatible with the control sequences. 

An "open reading frame" (ORF) is a region of a polynucleotide sequence which encodes a polypeptide; 
this region may represent a portion of a coding sequence or a total coding sequence. 

A "coding sequence" is a polynucleotide sequence which is transcribed into mRNA and/or translated 
into a polypeptide when placed under the control of appropriate regulatory sequences. The boundaries of 
the coding sequence are determined by a translation start codon at the 5'-terminus and a translation stop
codon at the 3'-terminus. A coding sequence can include, but is not limited to, mRNA, cDNA, and
recombinant polynucleotide sequences. 
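
As an informal illustration of the boundary rule just stated (a translation start codon at the 5'-terminus and a translation stop codon at the 3'-terminus), the following Python sketch scans the three forward reading frames of a sequence. The function name `find_orfs`, the use of ATG as the only start codon, and the standard stop codons TAA, TAG, and TGA are assumptions made for the example; they are not taken from this specification.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumed: standard genetic code


def find_orfs(seq, min_codons=1):
    """Return (start, end) pairs, 0-based with end exclusive, for every
    ATG...stop stretch found in the three forward reading frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the coding region
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # include the stop codon
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    toy = "CCATGAAATTTTAGGG"  # hypothetical toy sequence, not HCV data
    for start, end in find_orfs(toy):
        print(start, end, toy[start:end])
```

The sketch reports only regions closed by a stop codon. An open reading frame as defined above may also be a portion of a coding sequence, which this simple scan does not report.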

"Immunologically identifiable with/as" refers to the presence of epitope(s) and polypeptides(s) which are 
also present in the designated polypeptide(s), usually HCV proteins. Immunological identity may be 
determined by antibody binding and/or competition in binding; these techniques are known to those of
average skill in the art, and are also illustrated infra. 

As used herein, "epitope" refers to an antigenic determinant of a polypeptide; an epitope could 
comprise 3 amino acids in a spatial conformation which is unique to the epitope; generally an epitope
consists of at least 5 such amino acids, and more usually, consists of at least 8-10 such amino acids.
Methods of determining the spatial conformation of amino acids are known in the art, and include, for
example, x-ray crystallography and 2-dimensional nuclear magnetic resonance. 

A polypeptide is "immunologically reactive" with an antibody when it binds to an antibody due to 
antibody recognition of a specific epitope contained within the polypeptide. Immunological reactivity may be 
determined by antibody binding, more particularly by the kinetics of antibody binding, and/or by competition 
in binding using as competitor(s) a known polypeptide(s) containing an epitope against which the antibody
is directed. The techniques for determining whether a polypeptide is immunologically reactive with an 
antibody are known in the art. 

As used herein, the term "immunogenic polypeptide" is a polypeptide that elicits a cellular and/or
humoral response, whether alone or linked to a carrier in the presence or absence of an adjuvant.

The term "polypeptide" refers to a polymer of amino acids and does not refer to a specific length of the
product; thus, peptides, oligopeptides, and proteins are included within the definition of polypeptide. This
term also does not refer to or exclude post-expression modifications of the polypeptide, for example,
glycosylations, acetylations, phosphorylations and the like. Included within the definition are, for example,
polypeptides containing one or more analogs of an amino acid (including, for example, unnatural amino
acids, etc.), polypeptides with substituted linkages, as well as other modifications known in the art, both
naturally occurring and non-naturally occurring.

"Transformation", as used herein, refers to the insertion of an exogenous polynucleotide into a host ceil, 
irrespective of the method used for the insertion, for example, direct uptake, transduction, f-mating or
electroporation. The exogenous polynucleotide may be maintained as a non-integrated vector, for example,
a plasmid, or alternatively, may be integrated into the host genome.

"Treatment" as used herein refers to prophylaxis and/or therapy.

An "individual", as used herein, refers to vertebrates, particularly members of the mammalian species,
and includes but is not limited to domestic animals, sports animals, and primates, including humans.

As used herein, the "sense strand" of a nucleic acid contains the sequence that has sequence 
homology to that of mRNA. The "anti-sense strand" contains a sequence which is complementary to that of
the "sense strand". 
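
The complementarity relation between the two strands can be sketched in a few lines of Python. This is only an illustration under the usual Watson-Crick pairing rules; the function name `antisense` and the convention of writing the result 5' to 3' are assumptions of the example rather than part of the definition above.

```python
# Watson-Crick pairing; U is accepted on input and paired with A.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense(sense):
    """Return the DNA strand complementary to `sense`, written 5'->3'."""
    return sense.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(antisense("ATGGCC"))
```

For a DNA input, applying `antisense` twice returns the original string, which reflects the symmetry of the two definitions.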

As used herein, a "positive stranded genome" of a virus is one in which the genome, whether RNA or 
DNA, is single-stranded and which encodes a viral polypeptide(s). Examples of positive stranded RNA
viruses include Togaviridae, Coronaviridae, Retroviridae, Picornaviridae, and Caliciviridae. Included also are
the Flaviviridae, which were formerly classified as Togaviridae. See Fields & Knipe (1986).

As used herein, "antibody-containing body component" refers to a component of an individual's body 
which is a source of the antibodies of interest. Antibody containing body components are known in the art, 
and include but are not limited to, for example, plasma, serum, spinal fluid, lymph fluid, the external 
sections of the respiratory, intestinal, and genitourinary tracts, tears, saliva, milk, white blood cells, and 
myelomas.

As used herein, "purified HCV" refers to a preparation of HCV which has been isolated from the cellular 
constituents with which the virus is normally associated, and from other types of viruses which may be 
present in the infected tissue. The techniques for isolating viruses are known to those of skill in the art, and 
include, for example, centrifugation and affinity chromatography; a method of preparing purified HCV is 
discussed infra.

The term "HCV particles" as used herein includes entire virions as well as particles which are
intermediates in virion formation. HCV particles generally have one or more HCV proteins associated with 
the HCV nucleic acid. 

As used herein, the term "probe" refers to a polynucleotide which forms a hybrid structure with a 
sequence in a target region, due to complementarity of at least one sequence in the probe with a sequence
in the target region. The probe, however, does not contain a sequence complementary to sequence(s) used 
to prime the polymerase chain reaction. 

As used herein, the term "target region" refers to a region of the nucleic acid which is to be amplified 
and/or detected.

As used herein, the term "viral RNA", which includes HCV RNA, refers to RNA from the viral genome,
fragments thereof, transcripts thereof, and mutant sequences derived therefrom. 

As used herein, a "biological sample" refers to a sample of tissue or fluid isolated from an individual, 
including but not limited to, for example, plasma, serum, spinal fluid, lymph fluid, the external sections of
the skin, respiratory, intestinal, and genitourinary tracts, tears, saliva, milk, blood cells, tumors, organs, and
also samples of in vitro cell culture constituents (including but not limited to conditioned medium resulting
from the growth of cells in cell culture medium, putatively virally infected cells, recombinant cells, and cell
components). 



II. Description of the Invention 

The practice of the present invention will employ, unless otherwise indicated, conventional techniques 
of molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art.
Such techniques are explained fully in the literature. See, e.g., Maniatis, Fritsch & Sambrook, MOLECULAR
CLONING: A LABORATORY MANUAL (1982); DNA CLONING, VOLUMES I AND II (D.N. Glover ed. 1985);
OLIGONUCLEOTIDE SYNTHESIS (M.J. Gait ed. 1984); NUCLEIC ACID HYBRIDIZATION (B.D. Hames &
S.J. Higgins eds. 1984); TRANSCRIPTION AND TRANSLATION (B.D. Hames & S.J. Higgins eds. 1984);
ANIMAL CELL CULTURE (R.I. Freshney ed. 1986); IMMOBILIZED CELLS AND ENZYMES (IRL Press,
1986); B. Perbal, A PRACTICAL GUIDE TO MOLECULAR CLONING (1984); the series, METHODS IN
ENZYMOLOGY (Academic Press, Inc.); GENE TRANSFER VECTORS FOR MAMMALIAN CELLS (J.H.
Miller and M.P. Calos eds. 1987, Cold Spring Harbor Laboratory); Methods in Enzymology Vol. 154 and Vol.
155 (Wu and Grossman, and Wu, eds., respectively); Mayer and Walker, eds. (1987), IMMUNOCHEMICAL
METHODS IN CELL AND MOLECULAR BIOLOGY (Academic Press, London); Scopes (1987), PROTEIN
PURIFICATION: PRINCIPLES AND PRACTICE, Second Edition (Springer-Verlag, N.Y.); and HANDBOOK
OF EXPERIMENTAL IMMUNOLOGY, VOLUMES I-IV (D.M. Weir and C.C. Blackwell eds. 1986). All patents,
patent applications, and publications mentioned herein, both supra and infra, are hereby incorporated herein 
by reference. 

The useful materials and processes of the present invention are made possible by the provision of a
family of nucleotide sequences isolated from cDNA libraries which contain HCV cDNA sequences. These
cDNA libraries were derived from nucleic acid sequences present in the plasma of an HCV-infected
chimpanzee. The construction of one of these libraries, the "c" library (ATCC No. 40394), was reported in
EPO Pub. No. 318,216. Several of the clones containing HCV cDNA reported herein were obtained from the
"c" library. Although other clones reported herein were obtained from other HCV cDNA libraries, the
presence of clones containing the sequences in the "c" library was confirmed. As discussed in EPO Pub.
No. 318,216, the family of HCV cDNA sequences isolated from the "c" library are not of human or
chimpanzee origin, and show no significant homology to sequences contained within the HBV genome.

The availability of the HCV cDNAs described herein permits the construction of polynucleotide probes
which are reagents useful for detecting viral polynucleotides in biological samples, including donated blood.
For example, from the sequences it is possible to synthesize DNA oligomers of about 8-10 nucleotides, or
larger, which are useful as hybridization probes to detect the presence of HCV RNA in, for example,
donated blood, sera of subjects suspected of harboring the virus, or cell culture systems in which the virus
is replicating. In addition, the cDNA sequences also allow the design and production of HCV specific
polypeptides which are useful as diagnostic reagents for the presence of antibodies raised during HCV
infection. Antibodies to purified polypeptides derived from the cDNAs may also be used to detect viral
antigens in biological samples, including, for example, donated blood samples, sera from patients with
NANBH, and in tissue culture systems being used for HCV replication. Moreover, the immunogenic
polypeptides disclosed herein, which are encoded in portions of the ORF of HCV cDNA shown in Fig. 17,
are also useful for HCV screening, diagnosis, and treatment, and for raising antibodies which are also useful
for these purposes.

In addition, the novel cDNA sequences described herein enable further characterization of the HCV 
genome. Polynucleotide probes and primers derived from these sequences may be used to amplify 
sequences present in cDNA libraries, and/or to screen cDNA libraries for additional overlapping cDNA 
sequences, which, in turn, may be used to obtain more overlapping sequences. As indicated infra, and in
EPO Pub. No. 318,216, the genome of HCV appears to be RNA comprised primarily of a large open 
reading frame (ORF) which encodes a large polyprotein. 
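
The way overlapping cDNA sequences extend one another can be pictured with a small Python sketch. It is a simplified illustration only: the function name `merge_overlap`, the requirement of an exact match, and the default minimum overlap of 4 nucleotides are assumptions of the example, and the toy strings are not HCV sequences.

```python
def merge_overlap(left, right, min_overlap=4):
    """Join `right` onto `left` using the longest suffix of `left` that is
    also a prefix of `right`. Returns None when no exact overlap of at
    least `min_overlap` nucleotides exists."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    longest = min(len(left), len(right))
    for n in range(longest, min_overlap - 1, -1):
        if left[-n:] == right[:n]:
            return left + right[n:]  # keep the shared region only once
    return None


if __name__ == "__main__":
    # Hypothetical toy clones sharing the five nucleotides "CTTAG".
    print(merge_overlap("ATGGCCTTAG", "CTTAGGACA"))
```

Repeating the step with each newly obtained clone gives a progressively longer composite sequence. Real clones may differ at individual positions, which an exact-match sketch such as this does not handle.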

The HCV cDNA sequences provided herein, the polypeptides derived from these sequences, and the
immunogenic polypeptides described herein, as well as antibodies directed against these polypeptides are
also useful in the isolation and identification of the blood-borne NANBV (BB-NANBV) agent(s). For example,
antibodies directed against HCV epitopes contained in polypeptides derived from the cDNAs may be used
in processes based upon affinity chromatography to isolate the virus. Alternatively, the antibodies may be
used to identify viral particles isolated by other techniques. The viral antigens and the genomic material
within the isolated viral particles may then be further characterized. 

In addition to the above, the information provided infra allows the identification of additional HCV strains
or isolates. The isolation and characterization of the additional HCV strains or isolates may be accomplished
by isolating the nucleic acids from body components which contain viral particles and/or viral RNA, creating
cDNA libraries using polynucleotide probes based on the HCV cDNA probes described infra, screening the
libraries for clones containing HCV cDNA sequences described infra, and comparing the HCV cDNAs from
the new isolates with the cDNAs described infra. The polypeptides encoded therein, or in the viral genome,
may be monitored for immunological cross-reactivity utilizing the polypeptides and antibodies described
supra. Strains or isolates which fit within the parameters of HCV, as described in the Definitions section,
supra, are readily identifiable. Other methods for identifying HCV strains will be obvious to those of skill in
the art, based upon the information provided herein.



Isolation of the HCV cDNA Sequences 


The novel HCV cDNA sequences described infra extend the sequence of the cDNA to the HCV
genome reported in EPO Pub. No. 318,216. The sequences which are present in clones b114a, 18g, ag30a,
CA205a, CA290a, CA216a, pi14a, CA167b, CA156e, CA84a, and CA59a lie upstream of the reported
sequence, and when compiled, yield nucleotides nos. -319 to 1348 of the composite HCV cDNA sequence.
(The negative number on a nucleotide indicates its distance upstream of the nucleotide which starts the
putative initiator MET codon.) The sequences which are present in clones b5a and 16jh lie downstream of
the reported sequence, and yield nucleotides nos. 8659 to 8866 of the composite sequence. The composite
HCV cDNA sequence which includes the sequences in the aforementioned clones, is shown in Fig. 17.
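
The signed numbering convention described in the parenthetical above can be made concrete with a short Python sketch. Two points are assumptions of the example rather than statements of this specification: that the first nucleotide of the putative initiator MET codon is numbered 1, and that no nucleotide is numbered 0 (the nucleotide immediately upstream being -1). The names `to_signed_number`, `to_index`, and `MET_INDEX` are hypothetical.

```python
# 0-based position of the first nucleotide of the initiator MET codon in a
# string that begins at nucleotide -319 (assumed: there is no nucleotide 0).
MET_INDEX = 319


def to_signed_number(index, met_index=MET_INDEX):
    """Convert a 0-based string index to the signed nucleotide number."""
    offset = index - met_index
    return offset + 1 if offset >= 0 else offset


def to_index(number, met_index=MET_INDEX):
    """Convert a signed nucleotide number back to a 0-based string index."""
    if number == 0:
        raise ValueError("nucleotide number 0 is not used in this sketch")
    return met_index + (number - 1 if number > 0 else number)


if __name__ == "__main__":
    print(to_signed_number(0), to_signed_number(318), to_signed_number(319))
    print(to_index(-319), to_index(1), to_index(1348))
```

Under these assumptions, a string holding nucleotides -319 to 1348 would contain 319 + 1348 = 1667 characters.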

The novel HCV cDNAs described herein were isolated from a number of HCV cDNA libraries, including
the "c" library present in lambda gt11 (ATCC No. 40394). The HCV cDNA libraries were constructed using
pooled serum from a chimpanzee with chronic HCV infection and containing a high titer of the virus, i.e., at
least 10^6 chimp infectious doses/ml (CID/ml). The pooled serum was used to isolate viral particles; nucleic
acids isolated from these particles were used as the template in the construction of cDNA libraries to the
viral genome. The procedures for isolation of putative HCV particles and for constructing the "c" HCV cDNA
library are described in EPO Pub. No. 318,216. Other methods for constructing HCV cDNA libraries are
known in the art, and some of these methods are described infra, in the Examples. Isolation of the
sequences was by screening the libraries using synthetic polynucleotide probes, the sequences of which
were derived from the 5'-region and the 3'-region of the known HCV cDNA sequence. The description of
the method to retrieve the cDNA sequences is mostly of historical interest. The resultant sequences (and
their complements) are provided herein, and the sequences, or any portion thereof, could be prepared
using synthetic methods, or by a combination of synthetic methods with retrieval of partial sequences using
methods similar to those described herein.



Preparation of Viral Polypeptides and Fragments

The availability of HCV cDNA sequences, or nucleotide sequences derived therefrom (including
segments and modifications of the sequence), permits the construction of expression vectors encoding
antigenically active regions of the polypeptide encoded in either strand. These antigenically active regions
may be derived from coat or envelope antigens or from core antigens, or from antigens which are
non-structural including, for example, polynucleotide binding proteins, polynucleotide polymerase(s), and other
viral proteins required for the replication and/or assembly of the virus particle. Fragments encoding the
desired polypeptides are derived from the cDNA clones using conventional restriction digestion or by
synthetic methods, and are ligated into vectors which may, for example, contain portions of fusion
sequences such as beta-galactosidase or superoxide dismutase (SOD), preferably SOD. Methods and
vectors which are useful for the production of polypeptides which contain fusion sequences of SOD are
described in European Patent Office Publication number 0196056, published October 1, 1986. Vectors for
the expression of fusion polypeptides of SOD and HCV polypeptides encoded in a number of HCV clones
are described infra, in the Examples. Any desired portion of the HCV cDNA containing an open reading
frame, in either sense strand, can be obtained as a recombinant polypeptide, such as a mature or fusion
protein; alternatively, a polypeptide encoded in the cDNA can be provided by chemical synthesis.

The DNA encoding the desired polypeptide, whether in fused or mature form, and whether or not 
containing a signal sequence to permit secretion, may be ligated into expression vectors suitable for any 
convenient host. Both eukaryotic and prokaryotic host systems are presently used in forming recombinant 
polypeptides, and a summary of some of the more common control systems and host cell lines is given 
infra. The polypeptide is then isolated from lysed cells or from the culture medium and purified to the extent 
needed for its intended use. Purification may be by techniques known in the art, for example, differential 
extraction, salt fractionation, chromatography on ion exchange resins, affinity chromatography, centrifuga- 
tion, and the like. See, for example, Methods in Enzymology for a variety of methods for purifying proteins. 
Such polypeptides can be used as diagnostics, or those which give rise to neutralizing antibodies may be 
formulated into vaccines. Antibodies raised against these polypeptides can also be used as diagnostics, or 
for passive immunotherapy. In addition, as discussed infra, antibodies to these polypeptides are useful for
isolating and identifying HCV particles. 

Preparation of Antigenic Polypeptides and Conjugation with Carrier 

An antigenic region of a polypeptide is generally relatively small, typically 8 to 10 amino acids or less
in length. Fragments of as few as 5 amino acids may characterize an antigenic region. These segments
may correspond to regions of HCV antigen. Accordingly, using the cDNAs of HCV as a basis, DNAs
encoding short segments of HCV polypeptides can be expressed recombinantly either as fusion proteins, or
as isolated polypeptides. In addition, short amino acid sequences can be conveniently obtained by chemical
synthesis. In instances wherein the synthesized polypeptide is correctly configured so as to provide the
correct epitope, but is too small to be immunogenic, the polypeptide may be linked to a suitable carrier.

A number of techniques for obtaining such linkage are known in the art, including the formation of
disulfide linkages using N-succinimidyl-3-(2-pyridylthio)propionate (SPDP) and succinimidyl
4-(N-maleimidomethyl)cyclohexane-1-carboxylate (SMCC) obtained from Pierce Company, Rockford, Illinois. (If the
peptide lacks a sulfhydryl group, this can be provided by addition of a cysteine residue.) These reagents
create a disulfide linkage between themselves and peptide cysteine residues on one protein and an amide
linkage through the epsilon-amino on a lysine, or other free amino group in the other. A variety of such
disulfide/amide-forming agents are known. See, for example, Immun. Rev. (1982) 62:185. Other bifunctional
coupling agents form a thioether rather than a disulfide linkage. Many of these thioether-forming agents are
commercially available and include reactive esters of 6-maleimidocaproic acid, 2-bromoacetic acid,
2-iodoacetic acid, 4-(N-maleimidomethyl)cyclohexane-1-carboxylic acid, and the like. The carboxyl groups can
be activated by combining them with succinimide or 1-hydroxyl-2-nitro-4-sulfonic acid, sodium salt.
An additional method of coupling antigens employs the rotavirus/"binding peptide" system described in EPO
Pub. No. 259,149, the disclosure of which is incorporated herein by reference. The foregoing list is not
meant to be exhaustive, and modifications of the named compounds can clearly be used.

Any carrier may be used which does not itself induce the production of antibodies harmful to the host.
Suitable carriers are typically large, slowly metabolized macromolecules such as proteins; polysaccharides,
such as latex functionalized sepharose, agarose, cellulose, cellulose beads and the like; polymeric amino
acids, such as polyglutamic acid, polylysine, and the like; amino acid copolymers; and inactive virus
particles. Especially useful protein substrates are serum albumins, keyhole limpet hemocyanin,
immunoglobulin molecules, thyroglobulin, ovalbumin, tetanus toxoid, and other proteins well known to those
skilled in the art.

In addition to full-length viral proteins, polypeptides comprising truncated HCV amino acid sequences 
encoding at least one viral epitope are useful immunological reagents. For example, polypeptides compris- 
ing such truncated sequences can be used as reagents in an immunoassay. These polypeptides also are 
candidate subunit antigens in compositions for antiserum production or vaccines. While these truncated 
sequences can be produced by various known treatments of native viral protein, it is generally preferred to 
make synthetic or recombinant polypeptides comprising an HCV sequence. Polypeptides comprising these 
truncated HCV sequences can be made up entirely of HCV sequences (one or more epitopes, either 
contiguous or noncontiguous), or HCV sequences and heterologous sequences in a fusion protein. Useful 
heterologous sequences include sequences that provide for secretion from a recombinant host, enhance the 
immunological reactivity of the HCV epitope(s), or facilitate the coupling of the polypeptide to an 
immunoassay support or a vaccine carrier. See, e.g., EPO Pub. No. 116,201; U.S. Pat. No. 4,722,840; EPO
Pub. No. 259,149; U.S. Pat. No. 4,629,783, the disclosures of which are incorporated herein by reference.

The size of polypeptides comprising the truncated HCV sequences can vary widely, the minimum size 
being a sequence of sufficient size to provide an HCV epitope, while the maximum size is not critical. For 
convenience, the maximum size usually is not substantially greater than that required to provide the desired 
HCV epitopes and function(s) of the heterologous sequence, if any. Typically, the truncated HCV amino acid 
sequence will range from about 5 to about 100 amino acids in length. More typically, however, the HCV 
sequence will be a maximum of about 50 amino acids in length, preferably a maximum of about 30 amino 
acids. It is usually desirable to select HCV sequences of at least about 10, 12 or 15 amino acids, up to a 
maximum of about 20 or 25 amino acids. 

Truncated HCV amino acid sequences comprising epitopes can be identified in a number of ways. For 
example, the entire viral protein sequence can be screened by preparing a series of short peptides that 
together span the entire protein sequence. An example of antigenic screening of the regions of the HCV 
polyprotein is shown infra. In addition, by starting with, for example, 100mer polypeptides, it would be 
routine to test each polypeptide for the presence of epitope(s) showing a desired reactivity, and then testing 
progressively smaller and overlapping fragments from an identified 100mer to map the epitope of interest. 
Screening such peptides in an immunoassay is within the skill of the art. It is also known to carry out a 
computer analysis of a protein sequence to identify potential epitopes, and then prepare oligopeptides 
comprising the identified regions for screening. Such a computer analysis of the HCV amino acid sequence 
is shown in Fig. 20, where the hydrophilic/hydrophobic character is displayed above the antigen index. The 
amino acids are numbered from the starting MET (position 1) as shown in Fig. 17. It is appreciated by those 
of skill in the art that such computer analysis of antigenicity does not always identify an epitope that 
actually exists, and can also incorrectly identify a region of the protein as containing an epitope. Examples 
of HCV amino acid sequences that may be useful, which are expressed from expression vectors comprised 
of clones 5-1-1, 81, CA74a, 35f, 279a, C36, C33b, CA290a, C8f, C12f, 14c, 15e, C25c, C33c, C33f, 33g,
C39c, C40b, CA167b are described infra. Other examples of HCV amino acid sequences that may be useful
as described herein are set forth below. It is to be understood that these peptides do not necessarily 
precisely map one epitope, and may also contain HCV sequence that is not immunogenic. These non- 
immunogenic portions of the sequence can be defined as described above using conventional techniques 
and deleted from the described sequences. Further, additional truncated HCV amino acid sequences that 
comprise an epitope or are immunogenic can be identified as described above. The following sequences 
are given by amino acid number (i.e., "AAn") where n is the amino acid number as shown in Fig. 17:
AA1-AA25; AA1-AA50; AA1-AA84; AA9-AA177; AA1-AA10; AA5-AA20; AA20-AA25; AA35-AA45; AA50-AA100;
AA40-AA90; AA45-AA65; AA65-AA75; AA80-90; AA99-AA120; AA95-AA110; AA105-AA120; AA100-AA150;
AA150-AA200; AA155-AA170; AA190-AA210; AA200-AA250; AA220-AA240; AA245-AA265; AA250-AA300;
AA290-AA330; AA290-305; AA300-AA350; AA310-AA330; AA350-AA400; AA380-AA395; AA405-AA495;
AA400-AA450; AA405-AA415; AA415-AA425; AA425-AA435; AA437-AA582; AA450-AA500; AA440-AA460;
AA460-AA470; AA475-AA495; AA500-AA550; AA511-AA690; AA515-AA550; AA550-AA600; AA550-AA625;
AA575-AA605; AA5S5-AA600; AA600-AA650; AA600-AA625; AA635-AA665; AA650-AA700; AA645-AA680;
AA700-AA750; AA700-AA725; AA700-AA750; AA725-AA775; AA770-AA790; AA750-AA800; AA800-AA815;
AA825-AA850; AA850-AA875; AA800-AA850; AA920-AA990; AA850-AA900; AA920-AA945; AA940-AA965;
AA970-AA990; AA950-AA1000; AA1000-AA1060; AA1000-AA1025; AA1000-AA1050; AA1025-AA1040;
AA1040-AA1055; AA1075-AA1175; AA1050-AA1200; AA1070-AA1100; AA1100-AA1130; AA1140-AA1165;
AA1192-AA1457; AA1195-AA1250; AA1200-AA1225; AA1225-AA1250; AA1250-AA1300; AA1260-AA1310;
AA1260-AA1280; AA1266-AA1428; AA1300-AA1350; AA1290-AA1310; AA1310-AA1340; AA1345-AA1405;
AA1345-AA1365; AA1350-AA1400; AA1365-AA1380; AA1380-AA1405; AA1400-AA1450; AA1450-AA1500;
AA1460-AA1475; AA1475-AA1515; AA1475-AA1500; AA1500-AA1550; AA1500-AA1515; AA1515-AA1550;
AA1550-AA1600; AA1545-AA1560; AA1569-AA1931; AA1570-AA1590; AA1595-AA1610; AA1590-AA1650;
AA1610-AA1645; AA1650-AA1690; AA1685-AA1770; AA1689-AA1805; AA1690-AA1720; AA1694-AA1735;
AA1720-AA1745; AA1745-AA1770; AA1750-AA1800; AA1775-AA1810; AA1795-AA1850; AA1850-AA1900;
AA1900-AA1950; AA1900-AA1920; AA1916-AA2021; AA1920-AA1940; AA1949-AA2124; AA1950-AA2000;
AA1950-AA1985; AA1980-AA2000; AA2000-AA2050; AA2005-AA2025; AA2020-AA2045; AA2045-AA2100;
AA2045-AA2070; AA2054-AA2223; AA2070-AA2100; AA2100-AA2150; AA2150-AA2200; AA2200-AA2250;
AA2200-AA2325; AA2250-AA2330; AA2255-AA2270; AA2265-AA2280; AA2280-AA2290; AA2287-AA2385;
AA2300-AA2350; AA2290-AA2310; AA2310-AA2330; AA2330-AA2350; AA2350-AA2400; AA2348-AA2464;
AA2345-AA2415; AA2345-AA2375; AA2370-AA2410; AA2371-AA2502; AA2400-AA2450; AA2400-AA2425;
AA2415-AA2450; AA2445-AA2500; AA2445-AA2475; AA2470-AA2490; AA2500-AA2550; AA2505-AA2540;
AA2535-AA2560; AA2550-AA2600; AA2560-AA2580; AA2600-AA2650; AA2605-AA2620; AA2620-AA2650;
AA2640-AA2660; AA2650-AA2700; AA2655-AA2670; AA2670-AA2700; AA2700-AA2750; AA2740-
HBV antigenic sites from competition with the HCV epitope.
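
The screening approach described earlier in this section, in which a series of short overlapping peptides spans a protein sequence and fragments are named by amino acid number, can be sketched as follows. This is a minimal illustration, not the procedure used in the Examples: the function names `tile_peptides` and `fragment`, the peptide length and step, and the toy sequence are all hypothetical. Numbering is 1-based and inclusive, with position 1 taken as the starting MET.

```python
def tile_peptides(protein, length, step):
    """Return (first_aa, last_aa, peptide) tuples for overlapping peptides
    that together span `protein`, using 1-based inclusive numbering."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = max(len(protein) - length, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the C-terminus is covered
    return [(s + 1, min(s + length, len(protein)), protein[s:s + length])
            for s in starts]


def fragment(protein, label):
    """Slice `protein` by a label such as 'AA5-AA20' (1-based, inclusive).
    A label written as 'AA80-90' is read the same way as 'AA80-AA90'."""
    first, last = (int(part.replace("AA", "")) for part in label.split("-"))
    return protein[first - 1:last]


if __name__ == "__main__":
    toy = "ACDEFGHIKLMN"  # hypothetical 12-residue sequence, not HCV data
    for first, last, peptide in tile_peptides(toy, length=5, step=3):
        print(f"AA{first}-AA{last}", peptide)
    print(fragment(toy, "AA5-AA8"))
```

Choosing a step smaller than the peptide length is what produces the overlap, so that an epitope lying across the boundary of one peptide is still contained whole in a neighboring one.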



Preparation of Vaccines 

Vaccines may be prepared from one or more immunogenic polypeptides derived from HCV cDNA, 
including the cDNA sequences described in the Examples. The observed homology between HCV and 
Flaviviruses provides information concerning the polypeptides which may be most effective as vaccines, as 
well as the regions of the genome in which they are encoded. The general structure of the Flavivirus 
genome is discussed in Rice et al (1986). The flavivirus genomic RNA is believed to be the only
virus-specific mRNA species, and it is translated into the three viral structural proteins, i.e., C, M, and E, as well
as two large nonstructural proteins, NS4 and NS5, and a complex set of smaller nonstructural proteins. It is 
known that major neutralizing epitopes for Flaviviruses reside in the E (envelope) protein (Roehrig (1986)). 
Thus, vaccines may be comprised of recombinant polypeptides containing epitopes of HCV E. These 
polypeptides may be expressed in bacteria, yeast, or mammalian cells, or alternatively may be isolated 
from viral preparations. It is also anticipated that the other structural proteins may also contain epitopes 
which give rise to protective anti-HCV antibodies. Thus, polypeptides containing the epitopes of E, C, and M 
may also be used, whether singly or in combination, in HCV vaccines. 

In addition to the above, it has been shown that immunization with NS1 (nonstructural protein 1) results
in protection against yellow fever (Schlesinger et al (1986)). This is true even though the immunization does 
not give rise to neutralizing antibodies. Thus, particularly since this protein appears to be highly conserved 
among Flaviviruses, it is likely that HCV NS1 will also be protective against HCV infection. Moreover, it also 
shows that nonstructural proteins may provide protection against viral pathogenicity, even if they do not 
cause the production of neutralizing antibodies. 

The information provided in the Examples concerning the immunogenicity of the polypeptides ex- 
pressed from cloned HCV cDNAs which span the various regions of the HCV ORF also allows predictions 
concerning their use in vaccines. 

In view of the above, multivalent vaccines against HCV may be comprised of one or more epitopes 
from one or more structural proteins, and/or one or more epitopes from one or more nonstructural proteins. 
These vaccines may be comprised of, for example, recombinant HCV polypeptides and/or polypeptides
isolated from the virions. In particular, vaccines are contemplated comprising one or more of the following 
HCV proteins, or subunit antigens derived therefrom: E, NS1, C, NS2, NS3, NS4 and NS5. Particularly 
preferred are vaccines comprising E and/or NS1, or subunits thereof. 

The preparation of vaccines which contain an immunogenic polypeptide(s) as active ingredients, is 
known to one skilled in the art. Typically, such vaccines are prepared as injectables, either as liquid 
solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid prior to injection may 
also be prepared. The preparation may also be emulsified, or the protein encapsulated in liposomes. The 
active immunogenic ingredients are often mixed with excipients which are pharmaceutically acceptable and
compatible with the active ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol, 
ethanol, or the like and combinations thereof. In addition, if desired, the vaccine may contain minor amounts 
of auxiliary substances such as wetting or emulsifying agents, pH buffering agents, and/or adjuvants which 
enhance the effectiveness of the vaccine. Examples of adjuvants which may be effective include but are not
limited to: aluminum hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-nor-
muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP), N-acetylmuramyl-L-alanyl-D-
isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A,
referred to as MTP-PE), and RIBI, which contains three components extracted from bacteria, mono-
phosphoryl lipid A, trehalose dimycolate and cell wall skeleton (MPL + TDM + CWS) in a 2%
squalene/Tween 80 emulsion. The effectiveness of an adjuvant may be determined by measuring the
amount of antibodies directed against an immunogenic polypeptide containing an HCV antigenic sequence 
resulting from administration of this polypeptide in vaccines which are also comprised of the various 
adjuvants. 
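
As a minimal sketch of such a comparison, assume that reciprocal end-point antibody titers have been measured for groups immunized with the same HCV polypeptide formulated in different adjuvants. The geometric mean titer per group can then be ranked as shown below. The adjuvant names come from the list above; the titer values and the function names are hypothetical placeholders for illustration, not data from this disclosure.

```python
import math


def geometric_mean_titer(titers):
    """Geometric mean of reciprocal end-point titers (all values must be > 0)."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


def rank_adjuvants(groups):
    """Return (adjuvant, GMT) pairs sorted from highest to lowest GMT."""
    gmts = {name: geometric_mean_titer(vals) for name, vals in groups.items()}
    return sorted(gmts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical reciprocal titers per adjuvant group (illustrative only).
titers_by_adjuvant = {
    "aluminum hydroxide": [200, 400, 400, 800],
    "thr-MDP": [400, 800, 800, 1600],
    "MTP-PE": [800, 1600, 1600, 3200],
}

for name, gmt in rank_adjuvants(titers_by_adjuvant):
    print(f"{name}: GMT = {gmt:.0f}")
```

The geometric mean is used here because titers are typically obtained from serial dilutions and are therefore spread on a logarithmic scale; that choice is an assumption of the sketch rather than a requirement of the text.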

The vaccines are conventionally administered parenterally, by injection, for example, either sub-
cutaneously or intramuscularly. Additional formulations which are suitable for other modes of administration
include suppositories and, in some cases, oral formulations. For suppositories, traditional binders and
carriers may include, for example, polyalkylene glycols or triglycerides; such suppositories may be formed
from mixtures containing the active ingredient in the range of 0.5% to 10%, preferably 1%-2%. Oral
formulations include such normally employed excipients as, for example, pharmaceutical grades of
mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, and the

Dosage and Administration of Vaccines

The vaccines are administered in a manner compatible with the dosage formulation, and in such 

... active ingredient required to be administered may depend on the judgment of

with other immunoregulatory agents, for example, immune globulins.

Preparation of Antibodies Against HCV Epitopes

The immunogenic polypeptides prepared as described above ...

... immunoaffinity chromatography. Techniques for producing and processing polyclonal
... and those which are neutralizing are useful in passive immunotherapy. Monoclonal
antibodies, in particular, may be used to raise anti-idiotype antibodies.

... immunoglobulins which carry an "internal image" of the antigen of the
infectious agent against which protection is desired. See, for example, Nisonoff, A., et al. (1981) and
...

Techniques for raising anti-idiotype antibodies are known in the art. ...
MacNamara et al. (1984), and Uytdehaag et al. (1985). These anti-idiotype antibodies may also be useful
... diagnosis of NANBH, as well as for an elucidation of the immunogenic regions of HCV
antigens.

It would also be recognized by one of ordinary skill in the art that a variety of types of antibodies 
directed against HCV epitopes may be produced. As used herein, the term "antibody" refers to a 
polypeptide or group of polypeptides which are comprised of at least one antibody combining site. An
"antibody combining site" or "binding domain" is formed from the folding of variable domains of an
antibody molecule(s) to form three-dimensional binding spaces with an internal surface shape and charge
distribution complementary to the features of an epitope of an antigen, which allows an immunological
reaction with the antigen. An antibody combining site may be formed from a heavy and/or a light chain
domain (VH and VL, respectively), which form hypervariable loops which contribute to antigen binding. The
term "antibody" includes, for example, vertebrate antibodies, hybrid antibodies, chimeric antibodies, altered
antibodies, univalent antibodies, the Fab proteins, and single domain antibodies.

A "single domain antibody" (dAb) is an antibody which is comprised of a VH domain, which reacts
immunologically with a designated antigen. A dAb does not contain a VL domain, but may contain other
antigen binding domains known to exist in antibodies, for example, the kappa and lambda domains.
Methods for preparing dAbs are known in the art. See, for example, Ward et al. (1989).

Antibodies may also be comprised of VH and VL domains, as well as other known antigen binding 
domains. Examples of these types of antibodies and methods for their preparation are known in the art (see 
e.g., U.S. Patent No. 4,816,467, which is incorporated herein by reference), and include the following. For
example, "vertebrate antibodies" refers to antibodies which are tetramers or aggregates thereof, comprising
light and heavy chains which are usually aggregated in a "Y" configuration and which may or may not have 
covalent linkages between the chains. In vertebrate antibodies, the amino acid sequences of all the chains 
of a particular antibody are homologous with the chains found in one antibody produced by the lymphocyte 
which produces that antibody in situ, or in vitro (for example, in hybridomas). Vertebrate antibodies 
typically include native antibodies, for example, purified polyclonal antibodies and monoclonal antibodies.
Examples of the methods for the preparation of these antibodies are described infra.

"Hybrid antibodies" are antibodies wherein one pair of heavy and light chains is homologous to those in 
a first antibody, while the other pair of heavy and light chains is homologous to those in a different second 
antibody. Typically, each of these two pairs will bind different epitopes, particularly on different antigens.
This results in the property of "divalence", i.e., the ability to bind two antigens simultaneously. Such hybrids
may also be formed using chimeric chains, as set forth below. 

"Chimeric antibodies" are antibodies in which the heavy and/or light chains are fusion proteins.
Typically the constant domain of the chains is from one particular species and/or class, and the variable
domains are from a different species and/or class. Also included is any antibody in which either or both of
the heavy or light chains are composed of combinations of sequences mimicking the sequences in
antibodies of different sources, whether these sources be differing classes, or different species of origin,
and whether or not the fusion point is at the variable/constant boundary. Thus, it is possible to produce
antibodies in which neither the constant nor the variable region mimic known antibody sequences. It then
becomes possible, for example, to construct antibodies whose variable region has a higher specific affinity
for a particular antigen, or whose constant region can elicit enhanced complement fixation, or to make other
improvements in properties possessed by a particular constant region.

Another example is "altered antibodies", which refers to antibodies in which the naturally occurring 
amino acid sequence in a vertebrate antibody has been varied. Utilizing recombinant DNA techniques,
antibodies can be redesigned to obtain desired characteristics. The possible variations are many, and range
from the changing of one or more amino acids to the complete redesign of a region, for example, the
constant region. Changes in the constant region are, in general, made to attain desired cellular process
characteristics, e.g., changes in complement fixation, interaction with membranes, and other effector functions.
Changes in the variable region may be made to alter antigen binding characteristics. The antibody may also
be engineered to aid the specific delivery of a molecule or substance to a specific cell or tissue site. The
desired alterations may be made by known techniques in molecular biology, e.g., recombinant techniques,
site directed mutagenesis, etc.

Yet another example is "univalent antibodies", which are aggregates comprised of a heavy chain/light
chain dimer bound to the Fc (i.e., constant) region of a second heavy chain. This type of antibody escapes
antigenic modulation. See, e.g., Glennie et al. (1982). 

Included also within the definition of antibodies are "Fab" fragments of antibodies. The "Fab" region 
refers to those portions of the heavy and light chains which are roughly equivalent, or analogous to the 
sequences which comprise the branch portion of the heavy and light chains, and which have been shown to
exhibit immunological binding to a specified antigen, but which lack the effector Fc portion. "Fab" includes

II.H. Diagnostic Oligonucleotide Probes and Kits

Using the disclosed portions of the isolated HCV cDNAs as a basis, oligomers of approximately 8
nucleotides or more can be prepared, either ..., which hybridize with the HCV
genome and are useful in identification of the ... viral genome(s),
as well as in detection of the virus(es) in diseased .... ... HCV polynucleotides (natural
or derived) are a length which allows the detection of unique viral sequences by hybridization. While 6-8
nucleotides may be a workable length, sequences of 10-12 ..., and about 20
nucleotides appears optimal. Preferably, these sequences will ... which lack heterogeneity.
These probes can be prepared using ... oligonucleotide synthetic
methods. Among useful probes, for example, are those derived from the newly isolated clones disclosed
.... A complement to
... as the length of the ....

For use of such probes as diagnostics, the biological sample to be analyzed, such as ...,
may be treated, if desired, to extract the nucleic acids .... The ... nucleic acid
sample may be subjected to gel electrophoresis or ... techniques; alternatively, the
nucleic acid sample may be dot blotted without size separation. ... then labeled. Suitable
labels, and methods for labeling probes are ..., for example, radioactive labels
incorporated by nick translation or ... chemiluminescent probes. The
nucleic acids extracted from the sample are then ... labeled probe under hybridization
conditions of suitable stringencies, and polynucleotide duplexes containing the probe are detected.

The probes can be made .... Therefore, usually high
... false positives. However, conditions of high
... and concentration
of formamide. These factors are outlined in, for example, Maniatis, T. (1982).
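
Hybridization stringency depends on factors such as temperature, salt concentration, and the concentration of formamide. As a rough illustration of how these interact, the sketch below uses a commonly quoted empirical approximation for the melting temperature of a DNA duplex. The formula and its constants are taken from general hybridization practice and are an assumption of this sketch, not part of the present disclosure; it is known to be crude for short oligomers. The probe sequence is a hypothetical placeholder, not an HCV sequence.

```python
import math


def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm(seq, na_molar, formamide_percent):
    """Approximate duplex melting temperature in degrees C.

    Empirical form (assumed, not from this document):
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/length
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent(seq)
            - 0.61 * formamide_percent
            - 500.0 / len(seq))


probe = "ATGC" * 5  # hypothetical 20-mer with 50% GC content
for formamide in (0, 25, 50):
    tm = estimate_tm(probe, na_molar=0.75, formamide_percent=formamide)
    print(f"{formamide}% formamide: estimated Tm = {tm:.1f} C")
```

Under this approximation, each additional percent of formamide lowers the estimated melting temperature, so the same hybridization temperature becomes more stringent as formamide is added.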

Generally, it is expected that the HCV genome ...
individuals at relatively low levels, i.e., at approximately ... (CID) per ml. This
level may require that amplification techniques be used in hybridization ...
in the art. For example, the Enzo Biochemical ...
... deoxynucleotide transferase to add unmodified 3'-poly-dT tails to a DNA probe. The poly dT-tailed probe is
hybridized to the target nucleotide sequence and ...
84/03520 and EPA124221 describe a DNA hybridization assay ...
stranded DNA probe that is complementary ... resulting
tailed duplex is hybridized to an enzyme-labeled .... EP ... hybridization
assay in which analyte DNA is contacted with a probe that has a tail, such as a poly-dT tail, an
amplifier strand that has a sequence that ...
which is capable of binding a plurality of labeled strands. A particularly ... technique may first involve
amplification of the target HCV sequences in sera ... approximately 10*
sequences/ml. This may be accomplished, for example, by the polymerase chain reactions (PCR) technique
described which is by Saiki et al. (1986), by Mullis, U.S. Patent No. ..., and by Mullis et al. U.S.
Patent No. 4,683,202. The amplified sequence ... hybridization assay which is
... assays, which should detect

... labeled; alternatively, the probe DNA may be unlabeled and the ingredients for labeling may be included in
.... ... packaged ... and ...
needed for the particular hybridization protocol, for example, standards, as well as instructions for
....



Immunoassay and Diagnostic Kits 

Both the polypeptides which react immunologically with serum containing HCV antibodies, for example,
those detected by the antigenic screening method described infra, in the Examples, as well as those derived
... within the isolated clones described in the Examples, and composites thereof, and the
antibodies raised against the HCV specific epitopes in these polypeptides, are useful in immunoassays to
detect ... of antibodies, or the presence of the virus and/or viral antigens, in biological samples.
Design of the immunoassays is subject to a great deal of variation, and a variety of these are known in the
art. For example, the immunoassay may ... one viral epitope; alternatively, the immunoassay may use a
combination of viral epitopes derived from these sources; these epitopes may be derived from the same or
from different viral polypeptides, and may be in separate recombinant or natural polypeptides, or together in
the same recombinant polypeptides. It may use, for example, a monoclonal antibody directed towards a
viral epitope(s), a combination of monoclonal antibodies directed towards epitopes of one viral antigen,
monoclonal antibodies directed towards epitopes of different viral antigens, polyclonal antibodies directed
towards the same viral antigen, or polyclonal antibodies directed towards different viral antigens. Protocols
may be based, for example, upon competition, or direct reaction, or sandwich type assays. Protocols may
also, for example, use solid supports, or may be by immunoprecipitation. Most assays involve the use of
labeled antibody or polypeptide; the labels may be, for example, fluorescent, chemiluminescent, radioactive,
or .... Assays which amplify the signals from the probe are also known; examples of which are
assays which utilize biotin and avidin, and enzyme-labeled and mediated immunoassays, such as ELISA
assays.

Some of the antigenic regions of the putative polyprotein have been mapped and identified by
screening the antigenicity of bacterial expression products of HCV cDNAs which encode portions of the
polyprotein. See the Examples. Other antigenic regions of HCV may be detected by expressing the portions
of the HCV cDNAs in other expression systems, including yeast systems and cellular systems derived from
insects and vertebrates. In addition, studies giving rise to an antigenicity index and
... profile give rise to information concerning the probability of a region's
....

The studies on antigenic mapping by expression of HCV cDNAs showed that a number of clones
containing these cDNAs expressed polypeptides which were immunologically reactive with serum from
individuals with NANBH. No single polypeptide was immunologically reactive with all sera. Five of these
polypeptides were very immunogenic in that antibodies to the HCV epitopes in these polypeptides were
detected in many different patient sera, although the overlap in detection was not complete. Thus, the
results on the immunogenicity of the polypeptides encoded in the various clones suggest that efficient
detection systems may include the use of panels of epitopes. The epitopes in the panel may be
constructed into one or multiple polypeptides.
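
Because no single polypeptide reacted with all sera, choosing a panel can be viewed as a coverage problem. The following is a minimal sketch, assuming a table is available that records which sera react with which polypeptide; it applies a simple greedy set-cover heuristic, which is one possible selection strategy and not a method stated in this disclosure. The polypeptide labels and serum identifiers are hypothetical.

```python
def choose_epitope_panel(reactivity):
    """Greedy selection of polypeptides so that every reactive serum is detected.

    reactivity maps a polypeptide label to the set of serum ids it reacts with.
    Returns (panel, covered_sera). Ties are broken alphabetically by label.
    """
    all_sera = set().union(*reactivity.values())
    uncovered = set(all_sera)
    panel = []
    while uncovered:
        # Pick the polypeptide that detects the most still-undetected sera.
        best = max(sorted(reactivity), key=lambda p: len(reactivity[p] & uncovered))
        gained = reactivity[best] & uncovered
        if not gained:
            break
        panel.append(best)
        uncovered -= gained
    return panel, all_sera - uncovered


# Hypothetical reactivity table (illustrative only).
example = {
    "polypeptide_A": {1, 2, 3},
    "polypeptide_B": {3, 4},
    "polypeptide_C": {4, 5, 6},
    "polypeptide_D": {2},
}
panel, covered = choose_epitope_panel(example)
print("panel:", panel)
print("sera detected:", sorted(covered))
```

A greedy choice does not guarantee the smallest possible panel, but it is simple and usually close; the selected epitopes could then be placed on one or on several polypeptides, as noted above.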

Kits suitable for immunodiagnosis and containing the appropriate labeled reagents are constructed by
packaging the appropriate materials, including the polypeptides of the invention containing HCV epitopes or
antibodies directed against HCV epitopes in suitable containers, along with the remaining reagents and
materials required for the conduct of the assay, as well as a suitable set of assay instructions.



Further Characterization of the HCV Genome, Virions, and Viral Antigens Using Probes Derived From cDNA
to the Viral Genome

The HCV cDNA sequence information in the newly isolated clones described in the Examples may be
used to gain further information on the sequence of the HCV genome, and for identification and isolation of
the HCV agent, and thus will aid in its characterization including the nature of the genome, the structure of
..., and the nature of the antigens of which it is composed. This information, in turn, can lead
to additional ..., polypeptides derived from the HCV genome, and antibodies directed
against HCV epitopes which would be useful for the diagnosis and/or treatment of HCV caused NANBH.

The cDNA sequence information in the abovementioned clones is useful for the design of probes for the
isolation of additional cDNA sequences which are derived from ...
genome(s) from which the cDNAs in clones described herein and in EP 0,318,216 .... ...
labeled probes containing a sequence of approximately ... 20 or more
nucleotides, which are derived from regions close to the ... of the composite HCV
cDNA sequence shown in Fig. 17 may be used to isolate overlapping ... sequences from HCV cDNA
libraries. Alternatively, characterization of the genomic ... genome(s) isolated
from purified HCV particles. Methods for ... them during the
purification procedure are described herein, infra. Procedures for ... genomes from
viral particles are known in the art, and one procedure ... described in EP
.... ... An example of this
....

Methods for constructing cDNA libraries are ...
method for the construction of HCV cDNA libraries in lambda-gt11 ....
However, cDNA libraries which are useful for screening with nucleic acid probes may also be constructed in
other vectors known in the art, for example, lambda-gt10 (Huynh et al. (1985)).
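
The idea of walking outward from a known composite sequence with probes taken near its ends can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the default probe length of 20 nucleotides is chosen for the example, and the sequence shown is an arbitrary placeholder, not the sequence of Fig. 17.

```python
def terminal_probes(sequence, probe_len=20):
    """Return the first and last probe_len bases of a known sequence.

    Whitespace is ignored so that a wrapped sequence listing can be pasted in.
    """
    seq = "".join(sequence.split()).upper()
    if len(seq) < probe_len:
        raise ValueError("sequence is shorter than the requested probe length")
    return seq[:probe_len], seq[-probe_len:]


# Arbitrary placeholder sequence (not an HCV sequence).
composite = """
ACGTTGCAAG GCTTACCGAT TTGACCATGG
CCATAGGCTA AGCTTGGATC CGATTACAGG
"""
five_prime, three_prime = terminal_probes(composite)
print("5'-proximal probe:", five_prime)
print("3'-proximal probe:", three_prime)
```

Clones that hybridize to a terminal probe but extend beyond the known sequence supply the next stretch of overlapping cDNA, and the step can be repeated from the new end.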

Screening for Anti-Viral Agents for HCV 

... anti-viral agents which inhibit HCV replication, and ... methods are known to those of
...
means, for detecting the agent's effect on viral replication ....
For example, the HCV-polynucleotide probes described herein may ... the amount of viral
nucleic acid produced in a cell culture. This could ..., for example, by hybridization or
competition hybridization of the infected cell nucleic acids with a labeled HCV ... probe. For
example, also, anti-HCV antibodies may be used to identify and quantitate HCV ...
utilizing the immunoassays described herein. In addition, ... HCV
antigens in the infected cell culture by a competition assay, the polypeptides encoded within the HCV
cDNAs described herein are useful in these competition assays. ... HCV polypeptide
derived from the HCV cDNA would be labeled, and the inhibition of binding of this labeled polypeptide to an
.... Moreover,
... HCV ... able to replicate in a cell line
.... ... may be tested ...
...
example, anti-sense polynucleotides, etc.

Antisense polynucleotide molecules are comprised of a complementary nucleotide sequence which
allows them to hybridize specifically to designated .... ... polynucleotides
may include, for example, molecules that will block ... mRNA, or may be
molecules which prevent replication of viral RNA by transcriptase. They may ...
carry agents (non-covalently attached ...) ... inactive by
causing, for example, scissions in the viral RNA. They may also ... polynucleotides which
enhance and/or are required for viral infectivity, .... Antisense molecules which
are to hybridize to HCV derived RNAs may be ... of the HCV
... for HCV may be

Other types of drugs may be based upon polynucleotides which "mimic" important control regions of
the HCV genome, and which may be therapeutic due to their interactions with key components of the 
system responsible for viral infectivity or replication. 
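
Returning to the antisense molecules described above, their defining property is a nucleotide sequence complementary to a designated target region. A minimal sketch of deriving such a sequence is given below; it computes the reverse complement of a target RNA segment, written 5' to 3'. The function name and the example target are hypothetical, and no real HCV sequence is implied.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(target_rna):
    """Return the antisense (reverse complement) RNA, written 5' to 3'.

    DNA-style input is accepted: any T is treated as U.
    """
    rna = target_rna.upper().replace("T", "U")
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return rna.translate(_RNA_COMPLEMENT)[::-1]


target = "AUGGCC"  # hypothetical target segment
print("target:   ", target)
print("antisense:", antisense_rna(target))
```

In practice the choice of target region, length, and any chemical modification would be governed by the specificity, solubility, stability, and toxicity considerations of the particular application; the sketch addresses only base complementarity.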


General Methods 

The general techniques used in extracting the genome from a virus, preparing and probing a cDNA 
library, sequencing clones, constructing expression vectors, transforming cells, performing immunological 
assays such as radioimmunoassays and ELISA assays, for growing cells in culture, and the like are known
in the art and laboratory manuals are available describing these techniques. However, as a general guide, 
the following sets forth some sources currently available for such procedures, and for materials useful in 
carrying them out. 

Both prokaryotic and eukaryotic host cells may be used for expression of desired coding sequences
when appropriate control sequences which are compatible with the designated host are used. Among
prokaryotic hosts, E. coli is most frequently used. Expression control sequences for prokaryotes include 
promoters, optionally containing operator portions, and ribosome binding sites. Transfer vectors compatible 
with prokaryotic hosts are commonly derived from, for example, pBR322, a plasmid containing operons 
conferring ampicillin and tetracycline resistance, and the various pUC vectors, which also contain se-
quences conferring antibiotic resistance markers. These markers may be used to obtain successful
transformants by selection. Commonly used prokaryotic control sequences include the Beta-lactamase
(penicillinase) and lactose promoter systems (Chang et al. (1977)), the tryptophan (trp) promoter system
(Goeddel et al. (1980)) and the lambda-derived PL promoter and N gene ribosome binding site (Shimatake
et al. (1981)) and the hybrid tac promoter (De Boer et al. (1983)) derived from sequences of the trp and lac
UV5 promoters. The foregoing systems are particularly compatible with E. coli; if desired, other prokaryotic
hosts such as strains of Bacillus or Pseudomonas may be used, with corresponding control sequences. 

Eukaryotic hosts include yeast and mammalian cells in culture systems. Saccharomyces cerevisiae and 
Saccharomyces carlsbergensis are the most commonly used yeast hosts, and are convenient fungal hosts.
Yeast compatible vectors carry markers which permit selection of successful transformants by conferring
prototrophy to auxotrophic mutants or resistance to heavy metals on wild-type strains. Yeast compatible
vectors may employ the 2 micron origin of replication (Broach et al. (1983)), the combination of CEN3 and 
ARS1 or other means for assuring replication, such as sequences which will result in incorporation of an 
appropriate fragment into the host cell genome. Control sequences for yeast vectors are known in the art 
and include promoters for the synthesis of glycolytic enzymes (Hess et al. (1968); Holland et al. (1978)),
including the promoter for 3-phosphoglycerate kinase (Hitzeman (1980)). Terminators may also be included,
such as those derived from the enolase gene (Holland (1981)). Particularly useful control systems are those
which comprise the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) promoter or alcohol de-
hydrogenase (ADH) regulatable promoter, terminators also derived from GAPDH, and if secretion is desired,
leader sequence from yeast alpha factor. In addition, the transcriptional regulatory region and the transcrip-
tional initiation region which are operably linked may be such that they are not naturally associated in the
wild-type organism. These systems are described in detail in EPO 120,551, published October 3, 1984;
EPO 116,201, published August 22, 1984; and EPO 164,556, published December 18, 1985, all of which are
assigned to the herein assignee, and are hereby incorporated herein by reference.

Mammalian cell lines available as hosts for expression are known in the art and include many
immortalized cell lines available from the American Type Culture Collection (ATCC), including HeLa cells,
Chinese hamster ovary (CHO) cells, baby hamster kidney (BHK) cells, and a number of other cell lines. 
Suitable promoters for mammalian cells are also known in the art and include viral promoters such as that 
from Simian Virus 40 (SV40) (Fiers (1978)), Rous sarcoma virus (RSV), adenovirus (ADV), and bovine 
papilloma virus (BPV). Mammalian cells may also require terminator sequences and poly A addition
sequences; enhancer sequences which increase expression may also be included, and sequences which
cause amplification of the gene may also be desirable. These sequences are known in the art. Vectors
suitable for replication in mammalian cells may include viral replicons, or sequences which insure
integration of the appropriate sequences encoding NANBV epitopes into the host genome. 

Transformation may be by any known method for introducing polynucleotides into a host cell, including,
for example, packaging the polynucleotide in a virus and transducing a host cell with the virus, and by direct
uptake of the polynucleotide. The transformation procedure used depends upon the host to be transformed.
For example, transformation of the E. coli host cells with lambda-gt11 containing BB-NANBV sequences is
discussed in the Example section, infra. Bacterial transformation by direct uptake generally employs
treatment with calcium or rubidium chloride (Cohen (1972); Maniatis (1982)). Yeast transformation by direct
uptake may be carried out using the method of Hinnen et al. (1978). Mammalian transformations by direct
uptake may be conducted using the calcium phosphate precipitation method of Graham and Van der Eb 
(1978) or the various known modifications thereof. 

Vector construction employs techniques which are known in the art. Site-specific DNA cleavage is 
performed by treating with suitable restriction enzymes under conditions which generally are specified by 
the manufacturer of these commercially available enzymes. In general, about 1 microgram of plasmid or
DNA sequence is cleaved by 1 unit of enzyme in about 20 microliters buffer solution by incubation of 1-2 hr
at 37°C. After incubation with the restriction enzyme, protein is removed by phenol/chloroform extraction
and the DNA recovered by precipitation with ethanol. The cleaved fragments may be separated using
polyacrylamide or agarose gel electrophoresis techniques, according to the general procedures found in
Methods in Enzymology (1980) 65:499-560.
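
Before running a digest on a gel, it is often useful to predict the fragment sizes expected from a known sequence. The sketch below does this for a linear molecule and a single enzyme. It uses the EcoRI recognition site GAATTC, cut after the first G on the top strand, as the worked case; the function name and the test sequence are hypothetical, and only the top strand is modeled, so the single-stranded overhangs are not represented.

```python
def digest_linear(seq, site="GAATTC", cut_offset=1):
    """Cut a linear top-strand sequence at every occurrence of a recognition site.

    cut_offset is the number of bases of the site that stay on the left
    fragment (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical sequence containing two EcoRI sites.
example_dna = "AAA" + "GAATTC" + "TTTT" + "GAATTC" + "CC"
fragments = digest_linear(example_dna)
print("fragment lengths:", [len(f) for f in fragments])
```

Comparing the predicted lengths with the bands observed after electrophoresis is a quick check that the expected construct was cleaved.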

Sticky ended cleavage fragments may be blunt ended using E. coli DNA polymerase I (Klenow) in the
presence of the appropriate deoxynucleotide triphosphates (dNTPs). Treatment with
S1 nuclease may also be used, resulting in the hydrolysis of any single stranded DNA portions.

Ligations are carried out using standard buffer and temperature conditions using T4 DNA ligase and
ATP; sticky end ligations require less ATP and less ligase than blunt end ligations. When vector fragments
are used as part of a ligation mixture, the vector fragment is often treated with bacterial alkaline
phosphatase (BAP) or calf intestinal alkaline phosphatase to remove the 5'-phosphate and thus prevent
religation of the vector; alternatively, restriction enzyme digestion of unwanted fragments can be used to
prevent ligation.

Ligation mixtures are transformed into suitable cloning hosts, such as E. coli, and successful transfor-
mants selected by, for example, antibiotic resistance, and screened for the correct construction.

Synthetic oligonucleotides may be prepared using an automated oligonucleotide synthesizer as de-
scribed by Warner (1984). If desired the synthetic strands may be labeled with 32P by treatment with
polynucleotide kinase in the presence of 32P-ATP, using standard conditions for the reaction.

DNA sequences, including those isolated from cDNA libraries, may be modified by known techniques,
including, for example, site directed mutagenesis, as described by Zoller (1982). Briefly, the DNA to be
modified is packaged into phage as a single stranded sequence, and converted to a double stranded DNA
with DNA polymerase using, as a primer, a synthetic oligonucleotide complementary to the portion of the
DNA to be modified, and having the desired modification included in its own sequence. The resulting
double stranded DNA is transformed into a phage supporting host bacterium. Cultures of the transformed
bacteria, which contain replications of each strand of the phage, are plated in agar to obtain plaques.
Theoretically, 50% of the new plaques contain phage having the mutated sequence, and the remaining 50%
have the original sequence. Replicates of the plaques are hybridized to labeled synthetic probe at
temperatures and conditions which permit hybridization with the correct strand, but not with the unmodified
sequence. The sequences which have been identified by hybridization are recovered and cloned.
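
The theoretical 50% figure gives a simple way to estimate how many plaques should be screened. As a sketch, assuming each plaque independently carries the mutated sequence with a fixed probability, the number of plaques needed to find at least one mutant with a chosen confidence follows from 1 - (1 - f)^n. The function name and the confidence levels are illustrative assumptions; real mutant fractions are often below the theoretical value, which the first parameter allows for.

```python
import math


def plaques_needed(mutant_fraction=0.5, confidence=0.99):
    """Smallest n such that P(at least one mutant among n plaques) >= confidence."""
    if not 0 < mutant_fraction < 1:
        raise ValueError("mutant_fraction must be between 0 and 1 (exclusive)")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1 (exclusive)")
    return math.ceil(math.log(1 - confidence) / math.log(1 - mutant_fraction))


for fraction in (0.5, 0.1):
    n = plaques_needed(mutant_fraction=fraction, confidence=0.99)
    print(f"mutant fraction {fraction}: screen at least {n} plaques")
```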

DNA libraries may be probed using the procedure of Grunstein and Hogness (1975). Briefly, in this
procedure the DNA to be probed is immobilized on nitrocellulose filters, denatured, and prehybridized with
a buffer containing 0-50% formamide, 0.75 M NaCl, 75 mM Na citrate, 0.02% (wt/v) each of bovine serum
albumin, polyvinyl pyrrolidone, and Ficoll, 50 mM Na Phosphate (pH 6.5), 0.1% SDS and 100
micrograms/ml carrier denatured DNA. The percentage of formamide in the buffer, as well as the time and
temperature conditions of the prehybridization and subsequent hybridization steps depends on the strin-
gency required. Oligomeric probes which require lower stringency conditions are generally used with low
percentages of formamide, lower temperatures, and longer hybridization times. Probes containing more than
30 or 40 nucleotides such as those derived from cDNA or genomic sequences generally employ higher
temperatures, e.g., about 40-42°C, and a high percentage, e.g., 50%, formamide. Following prehybridiza-
tion, 5'-32P-labeled oligonucleotide probe is added to the buffer, and the filters are incubated in this mixture
under hybridization conditions. After washing, the treated filters are subjected to autoradiography to show
the location of the hybridized probe; DNA in corresponding locations on the original agar plates is used as
the source of the desired DNA.
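
The prehybridization buffer above is specified by final concentrations, so making it up from concentrated stocks is a dilution calculation (stock volume = final concentration / stock concentration x total volume). The sketch below works this through. The final concentrations are those given in the text; the stock solutions (20x SSC, 50x Denhardt's solution, 1 M phosphate, 10% SDS, 10 mg/ml carrier DNA) are assumed for illustration and are not specified in this document.

```python
def stock_volume(final_conc, stock_conc, total_volume_ml):
    """Volume of stock (ml) needed to reach final_conc in total_volume_ml."""
    return final_conc / stock_conc * total_volume_ml


def prehybridization_mix(total_volume_ml, formamide_percent):
    """Volumes (ml) of assumed stock solutions for the buffer described above."""
    if not 0 <= formamide_percent <= 50:
        raise ValueError("the text specifies 0-50% formamide")
    v = total_volume_ml
    volumes = {
        "formamide (100%)": stock_volume(formamide_percent, 100.0, v),
        # 20x SSC is 3 M NaCl, 0.3 M Na citrate; 0.75 M NaCl / 75 mM citrate is 5x.
        "20x SSC": stock_volume(0.75, 3.0, v),
        # 50x Denhardt's is 1% each of BSA, polyvinyl pyrrolidone, and Ficoll.
        "50x Denhardt's": stock_volume(0.02, 1.0, v),
        "1 M Na phosphate, pH 6.5": stock_volume(0.050, 1.0, v),
        "10% SDS": stock_volume(0.1, 10.0, v),
        # 100 micrograms/ml is 0.1 mg/ml.
        "10 mg/ml denatured carrier DNA": stock_volume(0.1, 10.0, v),
    }
    volumes["water"] = v - sum(volumes.values())
    return volumes


for component, ml in prehybridization_mix(100.0, 50).items():
    print(f"{component}: {ml:.1f} ml")
```

Lowering the formamide percentage for short oligomeric probes simply shifts volume from formamide to water in this scheme.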

For routine vector constructions, ligation mixtures are transformed into E. coli strain HB101 or other
suitable host, and successful transformants selected by antibiotic resistance or other markers. Plasmids
from the transformants are then prepared according to the method of Clewell et al. (1969), usually following
chloramphenicol amplification (Clewell (1972)). The DNA is isolated and analyzed, usually by restriction
enzyme analysis and/or sequencing. Sequencing may be by the dideoxy method of Sanger et al. (1977) as
further described by Messing et al. (1981), or by the method of Maxam et al. (1980). Problems with band
compression, which are sometimes observed in GC rich regions, were overcome by use of 7-
deazoguanosine according to Barr et al. (1986).

The enzyme-linked immunosorbent assay (ELISA) can be used to measure either antigen or antibody 
concentrations. This method depends upon conjugation of an enzyme to either an antigen or an antibody, 
and uses the bound enzyme activity as a quantitative label. To measure antibody, the known antigen is 
fixed to a solid phase (e.g., a microplate or plastic cup), incubated with test serum dilutions, washed, 
incubated with anti-immunoglobulin labeled with an enzyme, and washed again. Enzymes suitable for 
labeling are known in the art, and include, for example, horseradish peroxidase. Enzyme activity bound to 
the solid phase is measured by adding the specific substrate, and determining product formation or 
substrate utilization colorimetrically. The enzyme activity bound is a direct function of the amount of
antibody bound.

To measure antigen, a known specific antibody is fixed to the solid phase, the test material containing 
antigen is added, after an incubation the solid phase is washed, and a second enzyme-labeled antibody is 
added. After washing, substrate is added, and enzyme activity is estimated colorimetrically, and related to
antigen concentration. 



Examples 



Described below are examples of the present invention which are provided only for illustrative purposes, 
and not to limit the scope of the present invention. In light of the present disclosure, numerous 
embodiments within the scope of the claims will be apparent to those of ordinary skill in the art. 



Isolation and Sequence of Overlapping HCV cDNA Clones 13i, 26j, CA59a, CA84a, CA156e and CA167b



The clones 13i, 26j, CA59a, CA84a, CA156e and CA167b were isolated from the lambda-gt11 library
which contains HCV cDNA (ATCC No. 40394), the preparation of which is described in EPO Pub. No.
318,216 (published 31 May 1989), and WO 89/04669 (published 1 June 1989). Screening of the library was
with the probes described infra, using the method described in Huynh (1985). The frequency with which
positive clones appeared with the respective probes was about 1 in 50,000.

The isolation of clone 13i was accomplished using a synthetic probe derived from the sequence of 
clone 12f. The sequence of the probe was: 
5' GAA CGT TGC GAT CTG GAA GAC AGG GAC AGG 3'. 

The isolation of clone 26j was accomplished using a probe derived from the 5'-region of clone K9-1.
The sequence of the probe was: 

5' TAT CAG TTA TGC CAA CGG AAG CGG CCC CGA 3'. 

The isolation procedures for clone 12f and for clone k9-1 (also called K9-1) are described in EPO Pub.
No. 318,216, and their sequences are shown in Figs. 1 and 2, respectively. The HCV cDNA sequences of
clones 13i and 26j are shown in Figs. 4 and 5, respectively. Also shown are the amino acids encoded
therein, as well as the overlap of clone 13i with clone 12f, and the overlap of clone 26j with clone 13i. The
sequences for these clones confirmed the sequence of clone K9-1. Clone K9-1 had been isolated from a
different HCV cDNA library (See EP 0,218,316).

Clone CA59a was isolated utilizing a probe based upon the sequence of the 5'-region of clone 26j. The
sequence of this probe was: 

5' CTG GTT AGC AGG GCT TTT CTA TCA CCA CAA 3'. 

A probe derived from the sequence of clone CA59a was used to isolate clone CA84a. The sequence of 
the probe used for this isolation was: 

5' AAG GTC CTG GTA GTG CTG CTG CTA TTT GCC 3'. 

Clone CA156e was isolated using a probe derived from the sequence of clone CA84a. The sequence of 
the probe was: 

5' ACT GGA CGA CGC AAG GTT GCA ATT GCT CTA 3'.

Clone CA167b was isolated using a probe derived from the sequence of clone CA156e. The sequence
of the probe was: 

5' TTC GAC GTC ACA TCG ATC TGC TTG TCG GGA 3'. 

The nucleotide sequences of the HCV cDNAs in clones CA59a, CA84a, CA156e, and CA167b are
shown in Figs. 6, 7, 8, and 9, respectively. The amino acids encoded therein, as well as the overlap with the
sequences of relevant clones, are also shown in the Figs.






Creation of "pi" HCV cDNA Library 



A library of HCV cDNA, the "pi" library, was constructed from the same batch of infectious chimpanzee
[...] (ATCC No. 40394) described in EPO Pub. No. [...]

CA59A. The sequence of the primer was: 
5' GGT GAC GTG GGT TTC 3'. 



Isolation and Sequence of Clone pi14a

Screening of the "pi" HCV cDNA library described supra, with the probe used to isolate clone CA167b
[...] (ATCC No. 40394). In addition, pi14a also contains about 250 base pairs of DNA which are upstream of the HCV
cDNA in clone CA167b.

Isolation and Sequence of Clones CA216a, CA290a and ag30a

Based on the sequence of clone CA167b a synthetic probe was made having the following sequence:
[...] (ATCC No. 40394), which yielded clone [...]

[...] sequence of clone CA216a having the following sequence:

[...] cDNA library was made using nucleic acid extracted from the
[...] in the original lambda-gt11 cDNA library described above. The primer used
was based on the sequence of clones CA216a and CA290a:

[...] clone CA290a: [...]

[...] About 300 base-pairs of the ag30a sequence, however, is upstream of the sequence from clone
[...]
[...] translation, as well as the putative first amino acid of the putative polypeptide [...], and downstream
[...] amino acids encoded therein.



Isolation and Sequence of Clone CA205a



Clone CA205a was isolated from the original lambda gt-11 library (ATCC No. 40394), using a synthetic
probe derived from the HCV sequence in clone CA290a (Fig. 11). The sequence of the probe was:
5' TCA GAT CGT TGG TGG AGT TTA CTT GTT GCC 3'.
The sequence of the HCV cDNA in CA205a, shown in Fig. 13, overlaps with the cDNA sequences in both
clones ag30a and CA290a. The overlap of the sequence with that of CA290a is shown by the dotted line
above the sequence (the figure also shows the putative amino acids encoded in this fragment).

As observed from the HCV cDNA sequences in clones CA205a and ag30a, the putative HCV
polyprotein appears to begin at the ATG start codon; the HCV sequences in both clones contain an
in-frame, contiguous double stop codon (TGATAG) forty-two nucleotides upstream from this ATG. The HCV
ORF appears to begin after these stop codons, and to extend for at least 8907 nucleotides (See the 
composite HCV cDNA shown in Fig. 17). 
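
The reading-frame argument in the preceding paragraph can be checked mechanically. The following Python sketch is illustrative only: the function name and the toy sequence are assumptions made for this example and are not HCV sequence data. It walks upstream from an ATG in steps of three nucleotides and reports every in-frame stop codon together with its distance from the ATG.

```python
STOP_CODONS = {"TGA", "TAG", "TAA"}


def in_frame_upstream_stops(seq, atg_index):
    """Return (distance, codon) for each in-frame stop codon 5' of the ATG.

    distance is the number of nucleotides between the first base of the
    stop codon and the first base of the ATG.
    """
    hits = []
    pos = atg_index - 3
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            hits.append((atg_index - pos, codon))
        pos -= 3
    return hits


# Toy sequence (hypothetical): a TGATAG doublet, 36 filler bases, then an ATG.
toy = "TGATAG" + "GCC" * 12 + "ATGGCC"
atg = toy.index("ATG")
stops = in_frame_upstream_stops(toy, atg)
```

In the toy sequence the TGATAG doublet begins 42 nucleotides upstream of the ATG, so both stop codons fall in the same frame as the ATG, because 42 and 39 are multiples of three.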



Isolation and Sequence of Clone 18g



Based on the sequence of clone ag30a (See Fig. 12) and of an overlapping clone from the original 
lambda gt-11 library (ATCC No. 40394), CA230a, a synthetic probe was made having the following
sequence: 

5' CCA TAG TGG TCT GCG GAA CCG GTG AGT ACA 3'. 

Screening of the original lambda-gt11 HCV cDNA library with the probe yielded clone 18g, the HCV cDNA
sequence of which is shown in Fig. 14. Also shown in the figure are the overlap with clone ag30a and 
putative polypeptides encoded within the HCV cDNA. 

The cDNA in clone 18g (C18g or 18g) overlaps that in clones ag30a and CA205a, described supra. The
sequence of C18g also contains the double stop codon region observed in clone ag30a. The polynucleotide
region upstream of these stop codons presumably represents part of the 5'-region of the HCV genome,
which may contain short ORFs, and which can be confirmed by direct sequencing of the purified HCV
genome. These putative small encoded peptides may play a regulatory role in translation. The region of the
HCV genome upstream of that represented by C18g can be isolated for sequence analysis using essentially
the technique described in EPO Pub. No. 318,216 for isolating cDNA sequences upstream of the HCV
cDNA sequence in clone 12f. Essentially, small synthetic oligonucleotide primers of reverse transcriptase,
which are based upon the sequence of C18g, are synthesized and used to bind to the corresponding
sequence in HCV genomic RNA. The primer sequences are proximal to the known 5'-terminal of C18g, but
sufficiently downstream to allow the design of probe sequences upstream of the primer sequences. Known
standard methods of priming and cloning are used. The resulting cDNA libraries are screened with
sequences upstream of the priming sites (as deduced from the elucidated sequence of C18g). The HCV
genomic RNA is obtained from either plasma or liver samples from individuals with NANBH. Since HCV
appears to be a Flavi-like virus, the 5'-terminus of the genome may be modified with a "cap" structure. It is
known that Flavivirus genomes contain 5'-terminal "cap" structures (Yellow Fever virus, Rice et al. (1988);
Dengue virus, Hahn et al. (1988); Japanese Encephalitis Virus (1987)).



Isolation and Sequence of Clones from the beta-HCV cDNA library 



Clones containing cDNA representative of the 3'-terminal region of the HCV genome were isolated from
a cDNA library constructed from the original infectious chimpanzee plasma pool which was used for the
creation of the HCV cDNA lambda-gt11 library (ATCC No. 40394), described in EPO Pub. No. 318,216. In
order to create the DNA library, RNA extracted from the plasma was "tailed" with poly rA using poly (rA)
polymerase, and cDNA was synthesized using oligo(dT)12-18 as a primer for reverse transcriptase. The
resulting RNA:cDNA hybrid was digested with RNAase H, and converted to double stranded HCV cDNA.
The resulting HCV cDNA was cloned into lambda-gt10, using essentially the technique described in Huynh
(1985), yielding the beta (or b) HCV cDNA library. The procedures used were as follows.

An aliquot (12 ml) of the plasma was treated with proteinase K, and extracted with an equal volume of
phenol saturated with 0.05 M Tris-Cl, pH 7.5, 0.05% (v/v) beta-mercaptoethanol, 0.1% (w/v)
hydroxyquinolone, 1 mM EDTA. The resulting aqueous phase was re-extracted with the phenol mixture, followed
by 3 extractions with a 1:1 mixture containing phenol and chloroform:isoamyl alcohol (24:1), followed by 2
extractions with a mixture of chloroform and isoamyl alcohol (1:1). Subsequent to adjustment of the aqueous
phase to 200 mM with respect to NaCl, nucleic acids in the aqueous phase were precipitated overnight at
-20°C with 2.5 volumes of cold absolute ethanol. The precipitates were collected by centrifugation at
10,000 RPM for 40 min, washed with 70% ethanol containing 20 mM NaCl, and with 100% cold ethanol.
[...] were incubated in a solution containing TMN (50 mM Tris HCl, pH 7.9, [...]
precipitated overnight at -20°C with 2.5 volumes of ethanol in the presence of 200 mM NaCl.



Isolation of Clone b5a 




[...] 1000 base pairs. The 5'-region of this cDNA overlaps clones 35f, 19g, 26g, and 15e [...]

[...] terminal sequence of the HCV genome [...] (described infra).

[...] (ATCC No.
40394). (The original lambda-gt11 library is referred to herein as the C library).

Isolation and Sequence of [...]

[...] chimpanzee plasma pool which was used [...] supra. The isolated
[...] 40394) described in EPO Pub. No. 318,216. Isolation of the HCV [...] described in Sippel (1973), except
RNA was tailed at the 3'-end with ATP by E. coli poly-A [...]. The tailed
[...]
were:

Stuffer    NotI        SP6 Promoter                  Primer
AATTC      GCGGCCGC    CATACGATTTAGGTGACACTATAGAA    T15



The resultant cDNA was subjected to amplification by PCR using two primers:

Primer          Sequence
JH32 (30mer)    ATAGCGGCCGCCCTCGATTGCGAGATCTAC
JH11 (20mer)    AATTCGGGCGGCCGCCATACGA

The JH32 primer contained a 20-nucleotide sequence hybridizable to the 5'-end of the target region in the
cDNA, with an estimated Tm of 66°C. The JH11 was derived from a portion of the oligo dT-primer adapter;
thus, it is specific to the 3'-end of the cDNA with a Tm of 64°C. Both primers were designed to have a
recognition site for the restriction enzyme, NotI, at the 5'-end, for use in subsequent cloning of the amplified
HCV cDNA.
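
The statement that both primers carry a NotI recognition site near the 5'-end can be verified directly from the printed sequences. The short Python sketch below is an illustration, not part of the described procedure; the function and variable names are hypothetical. It reports each primer's length and the 0-based offset at which the NotI site (GCGGCCGC) begins.

```python
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence

# Primer sequences exactly as printed in the primer table.
PRIMERS = {
    "JH32": "ATAGCGGCCGCCCTCGATTGCGAGATCTAC",
    "JH11": "AATTCGGGCGGCCGCCATACGA",
}


def describe_primer(seq):
    """Return the primer length and the 0-based NotI offset (-1 if absent)."""
    return {"length": len(seq), "noti_offset": seq.find(NOTI_SITE)}


report = {name: describe_primer(seq) for name, seq in PRIMERS.items()}
```

Both sequences contain the site within the first ten nucleotides. As transcribed here, JH11 counts 22 nucleotides although the table labels it a 20mer; the sketch reports the length as printed and does not attempt to resolve that difference.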

The PCR reaction was carried out by suspending the cDNA and the primers in 100 microliters of
reaction mixture containing the four deoxynucleoside triphosphates, buffer salts and metal ions, and a
thermostable DNA polymerase isolated from Thermus aquaticus (Taq polymerase), which are in a Perkin
Elmer Cetus PCR kit (N801-0043 or N801-0055). The PCR reaction was performed for 35 cycles in a Perkin
Elmer Cetus DNA thermal cycler. Each cycle consisted of a 1.5 min denaturation step at 94°C, an
annealing step at 60°C for 2 min, and a primer extension step at 72°C for 3 min. The PCR products were
subjected to Southern blot analysis using a 30 nucleotide probe, JH34, the sequence of which was based
upon that of the 3'-terminal region of clone 15e. The sequence of JH34 is:
5' CTT GAT CTA CCT CCA ATC ATT CAA AGA CTC 3'.

The PCR products detected by the HCV cDNA probe ranged in size from about 50 to about 400 base pairs.
In order to clone the amplified HCV cDNA, the PCR products were cleaved with NotI and size selected
by polyacrylamide gel electrophoresis. DNA larger than 300 base pairs was cloned into the NotI site of
pUC18S. The vector pUC18S is constructed by including a NotI polylinker cloned between the EcoRI and
SalI sites of pUC18. The clones were screened for HCV cDNA using the JH34 probe. A number of positive
clones were obtained and sequenced. The nucleotide sequence of the HCV cDNA insert in one of these
clones, 16jh, and the amino acids encoded therein, are shown in Fig. 15. A nucleotide heterogeneity,
detected in the sequence of the HCV cDNA in clone 16jh as compared to another clone of this region, is
indicated in the figure.


Compiled HCV cDNA Sequences 



An HCV cDNA sequence has been compiled from a series of overlapping clones derived from the
various HCV cDNA libraries described supra. In this sequence, the compiled HCV cDNA sequence
obtained from clones b114a, 18g, ag30a, CA205a, CA290a, CA216a, pi14a, CA167b, CA156e, CA84a, and
CA59a is upstream of the compiled HCV cDNA sequence published in EPO Pub. No. 318,216, which is
shown in Fig. 16. The compiled HCV cDNA sequence obtained from clones b5a and 16jh is downstream of
the compiled HCV cDNA sequence published in EPO Pub. No. 318,216.

Fig. 17 shows the compiled HCV cDNA sequence derived from the above-described clones and the
compiled HCV cDNA sequence published in EPO Pub. No. 318,216. The clones from which the sequence
was derived are b114a, 18g, ag30a, CA205a, CA290a, CA216a, pi14a, CA167b, CA156e, CA84a, CA59a,
K9-1 (also called k9-1), 26j, 13i, 12f, 14i, 11b, 7f, 7e, 8h, 33c, 40b, 37b, 35, 36, 81, 32, 33b, 25c, 14c, 8f, 33f,
33g, 39c, 35f, 19g, 26g, 15e, b5a, and 16jh. In the figure the three dashes above the sequence indicate the
position of the putative initiator methionine codon.

Clone b114a was obtained using the cloning procedure described for clone b5a, supra, except that the
probe was the synthetic probe used to detect clone 18g, supra. Clone b114a overlaps with clones 18g,
ag30a, and CA205a, except that clone b114a contains an extra two nucleotides upstream of the sequence in
clone 18g (i.e., 5'-CA). These extra two nucleotides have been included in the HCV genomic sequence
shown in Fig. 17.

It should be noted that although several of the clones described supra have been obtained from
libraries other than the original HCV cDNA lambda-gt11 C library (ATCC No. 40394), these clones contain
HCV cDNA sequences which overlap HCV cDNA sequences in the original library. Thus, essentially all of
the HCV sequence is derivable from the original lambda-gt11 C library (ATCC No. 40394) which was used
to isolate the first HCV cDNA clone (5-1-1). The isolation of clone 5-1-1 is described in EPO Pub. No.
318,216.



Purification of Fusion Polypeptide C100-3 (Alternate method)


The fusion polypeptide, C100-3 (also called HCV c100-3 and alternatively, c100-3), is comprised of
superoxide dismutase (SOD) at the N-terminus and an in-frame C100 HCV polypeptide at the C-terminus. A
method for preparing the polypeptide by expression in yeast, and differential extraction of the insoluble
fraction of the extracted host yeast cells, is described in EPO Pub. No. 318,216. An alternate method for
the preparation of this fusion polypeptide is described below. In this method the antigen is precipitated from
the crude cell lysate with acetone; the acetone precipitated antigen is then subjected to ion-exchange
chromatography, and further purified by gel filtration.

The fusion polypeptide, C100-3 (HCV c100-3), is expressed in yeast strain JSC 308 (ATCC No. 20879)
transformed with pAB24C100-3 (ATCC No. 67976); the transformed yeast are grown under conditions which
allow expression (i.e., by growth in YEP containing 1% glucose). (See EPO Pub. No. 318,216). A cell lysate
is prepared by suspending the cells in Buffer A (20 mM Tris HCl, pH 8.0, 1 mM EDTA, 1 mM PMSF). The
cells are broken by grinding with glass beads in a Dynomill type homogenizer or its equivalent. The extent
of cell breakage is monitored by counting cells under a microscope with phase optics. Broken cells appear
dark, while viable cells are light-colored. The percentage of broken cells is determined.

When the percentage of broken cells is approximately 90% or greater, the broken cell debris is
separated from the glass beads by centrifugation, and the glass beads are washed with Buffer A. After
combining the washes and homogenate, the insoluble material in the lysate is obtained by centrifugation.
The material in the pellet is washed to remove soluble proteins by suspension in Buffer B (50 mM glycine,
pH 12.0, 1 mM DTT, 500 mM NaCl), followed by Buffer C (50 mM glycine, pH 10.0, 1 mM DTT). The
insoluble material is recovered by centrifugation, and solubilized by suspension in Buffer C containing SDS.
The extract solution may be heated in the presence of beta-mercaptoethanol and concentrated by
ultrafiltration. The HCV c100-3 in the extract is precipitated with cold acetone. If desired, the precipitate may
be stored at temperatures at about or below -15°C.

Prior to ion exchange chromatography, the acetone precipitated material is recovered by centrifugation,
and may be dried under nitrogen. The precipitate is suspended in Buffer D (50 mM glycine, pH 10.0, 1 mM
DTT, 7 M urea), and centrifuged to pellet insoluble material. The supernatant material is applied to an anion
exchange column previously equilibrated with Buffer D. Fractions are collected and analyzed by [...]
absorbance or gel electrophoresis on SDS polyacrylamide gels. Those fractions containing the HCV c100-3
polypeptide are pooled.

In order to purify the HCV C100-3 polypeptide by gel filtration, the pooled fractions from the ion-exchange
column are heated in the presence of beta-mercaptoethanol and SDS, and the eluate is
concentrated by ultrafiltration. The concentrate is applied to a gel filtration column previously equilibrated
with Buffer E (20 mM Tris HCl, pH 7.0, 1 mM DTT, 0.1% SDS). The presence of HCV c100-3 in the eluted
fractions, as well as the presence of impurities, are determined by gel electrophoresis on polyacrylamide
gels in the presence of SDS and visualization of the polypeptides. Those fractions containing purified HCV
c100-3 are pooled. Fractions high in HCV c100-3 may be further purified by repeating the gel filtration
process. If the removal of particulate material is desired, the HCV c100-3 containing material may be filtered
through a 0.22 micron filter.

Expression and Antigenicity of Polypeptides Encoded in HCV cDNA
Polypeptides Expressed in E. coli

The polypeptides encoded in a number of HCV cDNAs which span the HCV genomic ORF were
expressed in E. coli, and tested for their antigenicity using serum obtained from a variety of individuals with
NANBH. The expression vectors containing the cloned HCV cDNAs were constructed from pSODcf1
(Steimer et al. (1986)). In order to be certain that a correct reading frame would be achieved, three separate
expression vectors, pcf1AB, pcf1CD, and pcf1EF, were created by ligating one of three linkers, AB, CD,
and EF, to a BamHI-EcoRI fragment derived by digesting to completion the vector pSODcf1 with EcoRI and
BamHI, followed by treatment with alkaline phosphatase. The linkers were created from six oligomers, A, B,
C, D, E, and F. Each oligomer was phosphorylated by treatment with kinase in the presence of ATP prior to
annealing to its complementary oligomer. The sequences of the synthetic linkers were the following.

Name    DNA Sequence (5' to 3')
A       GATC CTG AAT TCC TGA TAA
B       GAC TTA AGG ACT ATT TTA A
C       GATC CGA ATT CTG TGA TAA
D       GCT TAA GAC ACT ATT TTA A
E       GATC CTG GAA TTC TGA TAA
F       GAC CTT AAG ACT ATT TTA A



Each of the three linkers destroys the original EcoRI site, and creates a new EcoRI site within the linker, but 
within a different reading frame. Hence, the HCV cDNA EcoRI fragments isolated from the clones when 
inserted into the expression vector, were in three different reading frames. 
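
The claim that the three linkers place the new EcoRI site in three different reading frames can be checked from the oligomer sequences themselves. The Python sketch below is illustrative; the names are hypothetical and only the top-strand oligomers A, C, and E (spaces removed) are used. It locates the EcoRI site (GAATTC) in each linker and reduces its offset modulo three.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

# Top-strand oligomers A, C and E of linkers AB, CD and EF, as listed in the text.
LINKER_TOP_STRANDS = {
    "AB": "GATCCTGAATTCCTGATAA",
    "CD": "GATCCGAATTCTGTGATAA",
    "EF": "GATCCTGGAATTCTGATAA",
}


def ecori_offsets(strands):
    """Map each linker name to the 0-based offset of its EcoRI site."""
    return {name: seq.find(ECORI_SITE) for name, seq in strands.items()}


offsets = ecori_offsets(LINKER_TOP_STRANDS)
# The offset modulo 3 gives the frame of the site relative to the linker's 5'-end.
frames = {name: offset % 3 for name, offset in offsets.items()}
```

The site begins at offsets 6, 5, and 7 in AB, CD, and EF respectively, so the three frame values are all different, which is consistent with each inserted EcoRI fragment being read in a distinct frame.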

The HCV cDNA fragments in the designated lambda-gt11 clones were excised by digestion with EcoRI;
each fragment was inserted into pcf1AB, pcf1CD, and pcf1EF. These expression constructs were then
transformed into D1210 E. coli cells, the transformants were cloned, and recombinant bacteria from each
clone were induced to express the fusion polypeptides by growing the bacteria in the presence of IPTG.

Expression products of the indicated HCV cDNAs were tested for antigenicity by direct immunological
screening of the colonies, using a modification of the method described in Helfman et al. (1983). Briefly, as
shown in Fig. 18, the bacteria were plated onto nitrocellulose filters overlaid on ampicillin plates to give
approximately 1,000 colonies per filter. Colonies were replica plated onto nitrocellulose filters, and the
replicas were regrown overnight in the presence of 2 mM IPTG and ampicillin. The bacterial colonies were
lysed by suspending the nitrocellulose filters for about 15 to 20 min in an atmosphere saturated with CHCl3
vapor. Each filter then was placed in an individual 100 mm Petri dish containing 10 ml of 50 mM Tris HCl,
pH 7.5, 150 mM NaCl, 5 mM MgCl2, 3% (w/v) BSA, 40 micrograms/ml lysozyme, and 0.1 microgram/ml
DNase. The plates were agitated gently for at least 8 hours at room temperature. The filters were rinsed in
TBST (50 mM Tris HCl, pH 8.0, 150 mM NaCl, 0.005% Tween 20). After incubation, the cell residues were
rinsed and incubated in TBS (TBST without Tween) containing 10% sheep serum; incubation was for 1
hour. The filters were then incubated with pretreated sera in TBS from individuals with NANBH, which
included: 3 chimpanzees; 8 patients with chronic NANBH whose sera were positive with respect to
antibodies to HCV C100-3 polypeptide (described in EPO Pub. No. 318,216, and supra) (also called C100);
8 patients with chronic NANBH whose sera were negative for anti-C100 antibodies; a convalescent patient
whose serum was negative for anti-C100 antibodies; and 6 patients with community acquired NANBH,
including one whose serum was strongly positive with respect to anti-C100 antibodies, and one whose serum
was marginally positive with respect to anti-C100 antibodies. The sera, diluted in TBS, were pretreated by
preabsorption with hSOD. Incubation of the filters with the sera was for at least two hours. After incubation
the filters were washed two times for 30 min with TBST. Labeling of expressed proteins to which antibodies
in the sera bound was accomplished by incubation for 2 hours with 125I-labeled sheep anti-human antibody.
After labeling, the filters were washed twice for 30 min with TBST, dried, and autoradiographed.

A number of clones (see infra) expressed polypeptides containing HCV epitopes which were
immunologically reactive with serum from individuals with NANBH. Five of these polypeptides were very
immunogenic in that antibodies to HCV epitopes in these polypeptides were detected in many different
patient sera. The clones encoding these polypeptides, and the location of the polypeptide in the putative
HCV polyprotein (wherein the amino acid numbers begin with the putative initiator codon) are the following:
clone 5-1-1, amino acids 1694-1735; clone C100, amino acids 1569-1931; clone 33c, amino acids 1192-1457;
clone CA279a, amino acids 1-84; and clone CA290a, amino acids 9-177. The locations of the
immunogenic polypeptides within the putative HCV polyprotein are shown immediately below.

Clones encoding polypeptides of proven
reactivity with sera from NANBH patients.

Clone      Location within the HCV polyprotein
           (amino acid no. beginning with
           putative initiator methionine)
CA279a     1-84
CA74a      [illegible]
13i        [illegible]
CA290a     9-177
33c        1192-1457
40b        [illegible]
5-1-1      1694-1735
81         1689-1805
33b        1916-2021
25c        1949-2124
14c        2054-2223
8f         2200-3325
33f        2287-2385
33g        2348-2464
39c        2371-2502
15e        2796-2886
C100       1569-1931

The results on the immunogenicity of the polypeptides encoded in the various clones examined
suggest efficient detection and immunization systems may include panels of HCV polypeptides/epitopes.



Expression of HCV Epitopes in Yeast 

Three different yeast expression vectors which allow the insertion of HCV cDNA into three different reading
frames are constructed. The construction of one of the vectors, pAB24C100-3, is described in EPO Pub.
No. 318,216. In the studies below, the HCV cDNA from the clones listed [...], supra, [...]
mapping study using the E. coli expressed products are substituted for the C100 HCV cDNA. The
construction of the other vectors replaces the adaptor described in the above E. coli studies with one of the
following adaptors:

Adaptor 1

ATT TTG AAT TCC TAA TGA G
AC TTA AGG ATT ACT CAG CT

Adaptor 2

AAT TTG GAA TTC TAA TGA G
AC CTT AAG ATT ACT CAG CT

The inserted HCV cDNA is expressed in yeast transformed with the vectors, using the expression
conditions described supra for the expression of the fusion polypeptide, C100-3. The resulting polypeptides
are screened using the sera from individuals with NANBH, described supra for the screening of
immunogenic polypeptides encoded in HCV cDNAs expressed in E. coli.



Comparison of the Hydrophobic Profiles of HCV Polyproteins with West Nile Virus Polyprotein and with Dengue Virus NS1



The hydrophobicity profile of an HCV polyprotein segment was compared with that of a typical
Flavivirus, West Nile virus. The polypeptide sequence of the West Nile virus polyprotein was deduced from
the known polynucleotide sequences encoding the non-structural proteins of that virus. The HCV
polyprotein sequence was deduced from the sequence of overlapping cDNA clones. The profiles were
determined using an antigen program which uses a window of 7 amino acid width (the amino acid in
question, and 3 residues on each side) to report the average hydrophobicity about a given amino acid
residue. The parameters giving the relative hydrophobicity for each amino acid residue are from Kyte and
Doolittle (1982). Fig. 19 shows the hydrophobic profiles of the two polyproteins; the areas corresponding to
the non-structural proteins of West Nile virus, ns1 through ns5, are indicated in the figure. As seen in the
figure, there is a general similarity in the profiles of the HCV polyprotein and the West Nile virus
polyprotein.
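
The windowed averaging described above can be sketched in a few lines. The following Python example is a minimal illustration and is not the program used for Fig. 19; the function name is hypothetical, and the per-residue values are the published Kyte and Doolittle hydropathy indices. For each residue that has three neighbours on both sides, it averages the seven values in the window.

```python
# Kyte and Doolittle (1982) hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Average hydropathy over a centred window for each eligible residue.

    Residues closer than window // 2 positions to either end are skipped,
    so the result has len(sequence) - window + 1 entries (or none).
    """
    half = window // 2
    profile = []
    for centre in range(half, len(sequence) - half):
        segment = sequence[centre - half:centre + half + 1]
        profile.append(sum(KYTE_DOOLITTLE[aa] for aa in segment) / window)
    return profile
```

With this sign convention, positive averages mark hydrophobic stretches and negative averages mark hydrophilic ones. How the original program treated the residues at the two ends of the sequence is not stated, so skipping them here is an assumption.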

The sequence of the amino acids encoded in the 5'-region of HCV cDNA shown in Fig. 16 has been
compared with the corresponding region of one of the strains of Dengue virus, described supra, with
respect to the profile of regions of hydrophobicity and hydrophilicity (data not shown). This comparison
indicated that the polypeptides from HCV and Dengue encoded in this region, which corresponds to the
region encoding NS1 (or a portion thereof), have a similar hydrophobic/hydrophilic profile.

The similarity in hydrophobicity profiles, in combination with the previously identified homologies in the 
amino acid sequences of HCV and Dengue Flavivirus in EP 0,218,316 suggests that HCV is related to these 
members of the Flavivirus family. 


Characterization of the Putative Polypeptides Encoded Within the HCV ORF 



The sequence of the HCV cDNA sense strand, shown in Fig. 17, was deduced from the overlapping 
HCV cDNAs in the various clones described in EPO Pub. No. 318,216 and those described supra. It may be
deduced from the sequence that the HCV genome contains primarily one long continuous ORF, which
encodes a polyprotein. In the sequence, nucleotide number 1 corresponds to the first nucleotide of the 
initiator MET codon; minus numbers indicate that the nucleotides are that distance away in the 5'-direction 
(upstream), while positive numbers indicate that the nucleotides are that distance away in the 3'-direction 
(downstream). The composite sequence shows the "sense" strand of the HCV cDNA. 
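
The numbering convention just described can be written as a small conversion. The Python sketch below is an illustration only; the function name is hypothetical, and it assumes that the convention has no position 0, so that the nucleotide immediately upstream of the initiator codon is numbered -1. That reading follows from "that distance away" but is not stated explicitly in the text.

```python
def cdna_number(index, atg_index):
    """Convert a 0-based string index into the numbering used for the sequence.

    The first nucleotide of the initiator codon (at atg_index) is number 1;
    upstream nucleotides are negative, with no position 0 (assumed).
    """
    offset = index - atg_index
    return offset + 1 if offset >= 0 else offset


# Example with a made-up sequence whose ATG starts at index 4.
example = "CCGAATGGCT"
atg_index = example.index("ATG")
numbers = [cdna_number(i, atg_index) for i in range(len(example))]
```

Under this reading, a feature that starts 42 nucleotides upstream of the ATG begins at position -42.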

The amino acid sequence of the putative HCV polyprotein deduced from the HCV cDNA sense strand
sequence is also shown in Fig. 17, where position 1 begins with the putative initiator methionine. 

Possible protein domains of the encoded HCV polyprotein, as well as the approximate boundaries, are
the following (the polypeptides identified within the parentheses are those which are encoded in the
Flavivirus domain):

Putative Domain                                                      Approximate Boundary
                                                                     (amino acid nos.)
"C" (nucleocapsid protein)                                           1-120
"E" (virion envelope protein(s) and possibly matrix (M) proteins)    120-400
"NS1" (complement fixation antigen?)                                 400-660
"NS2" (unknown function)                                             660-1050
"NS3" (protease?)                                                    1050-1640
"NS4" (unknown function)                                             1640-2000
"NS5" (polymerase)                                                   2000-? end



It should be noted, however, that hydrophobicity profiles (described infra) indicate that HCV diverges
from the Flavivirus model, particularly with respect to the region upstream of NS2. Moreover, the
boundaries indicated are not intended to show firm demarcations between the putative polypeptides.
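
For orientation, the approximate domain boundaries can be turned into a lookup that reports which putative domains a given stretch of amino acids touches. This Python sketch is illustrative; the function name is hypothetical, and because adjacent domains share a boundary number in the table (for example 120), the sketch arbitrarily assigns each shared number to the downstream domain. Since the boundaries are approximate, the output is a rough guide and not a demarcation.

```python
from bisect import bisect_right

# Approximate first residue of each putative domain, from the table above.
DOMAIN_STARTS = [1, 120, 400, 660, 1050, 1640, 2000]
DOMAIN_NAMES = ["C", "E", "NS1", "NS2", "NS3", "NS4", "NS5"]


def domains_spanned(first, last):
    """List the putative domains touched by amino acids first..last."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    start = bisect_right(DOMAIN_STARTS, first) - 1
    end = bisect_right(DOMAIN_STARTS, last) - 1
    return DOMAIN_NAMES[start:end + 1]
```

Applied to amino acid ranges quoted elsewhere in this document, clone 5-1-1 (1694-1735) falls within "NS4", clone 33c (1192-1457) within "NS3", and C100 (1569-1931) spans "NS3" and "NS4".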



The Hydrophilic and Antigenic Profile of the Polypeptide 



Profiles of the hydrophilicity/hydrophobicity and the antigenic index of the putative polyprotein encoded
in the HCV cDNA sequence shown in Fig. 16 were determined by computer analysis. The program for
hydrophilicity/hydrophobicity was as described supra. The antigenic index results from a computer program
which relies on the following criteria: 1) surface probability; 2) prediction of alpha-helicity by two different
methods; 3) prediction of beta-sheet regions by two different methods; 4) prediction of U-turns by two
different methods; 5) hydrophilicity/hydrophobicity; and flexibility. The traces of the profiles generated by
the computer analyses are shown in Fig. 20. In the hydrophilicity profile, deflection above the abscissa
indicates hydrophilicity, and below the abscissa indicates hydrophobicity. The probability that a polypeptide
region is antigenic is usually considered to increase when there is a deflection upward from the abscissa in
the hydrophilic and/or antigenic profile. It should be noted, however, that these profiles are not necessarily
indicators of the strength of the immunogenicity of a polypeptide.
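
A hydrophilicity profile of the kind described above can be approximated by averaging per-residue scores over a sliding window. The sketch below is illustrative only and is not the program referred to supra: the use of the Hopp-Woods scale (in which positive values indicate hydrophilicity, matching the sign convention of the profile), the window length of 6, and the function name `hydrophilicity_profile` are all assumptions.

```python
# Hopp-Woods hydrophilicity values, one-letter amino acid codes.
# Positive values indicate hydrophilic residues.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(sequence, window=6):
    """Return the mean Hopp-Woods score of every full window in the sequence.

    Element i of the result covers residues i .. i + window - 1 (0-based).
    Raises KeyError if the sequence contains a non-standard residue code.
    """
    seq = sequence.upper()
    if window < 1 or len(seq) < window:
        return []
    scores = [HOPP_WOODS[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]
```

Values above zero correspond to a deflection above the abscissa (hydrophilic), and values below zero to a deflection below it (hydrophobic).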



Identification of Co-linear Peptides in HCV and Flaviviruses

The amino acid sequence of the putative polyprotein encoded in the HCV cDNA sense strand was
compared with the known amino acid sequences of several members of Flaviviruses. The comparison
shows that homology is slight, but due to the regions in which it is found, it is probably significant. The
conserved co-linear regions are shown in Fig. 21. The amino acid numbers listed below the sequences
represent the number in the putative HCV polyprotein (see Fig. 17).

The spacing of these conserved motifs is similar between the Flaviviruses and HCV, and implies that
there is some similarity between HCV and these flaviviral agents.

The following listed materials are on deposit under the terms of the Budapest Treaty with the American
Type Culture Collection (ATCC), 12301 Parklawn Dr., Rockville, Maryland 20852, and have been assigned
the following Accession Numbers.

lambda-gt11             ATCC No.      Deposit Date

HCV cDNA library        40394         1 Dec. 1987
clone 81                40388         17 Nov. 1987
clone 91                40389         17 Nov. 1987
clone 1-2               40390         17 Nov. 1987
clone 5-1-1             40391         18 Nov. 1987
clone 12f               40514         10 Nov. 1988
clone 35f               40511         10 Nov. 1988
clone 15e               40513         10 Nov. 1988
clone K9-1              40512         10 Nov. 1988
JSC 308                 20879         5 May 1988
pS356                   67683         29 April 1988



In addition, the following deposits were made on 11 May 1989. 



Strain                      Linkers       ATCC No.

D1210 (Cf1/5-1-1)           EF            67967
D1210 (Cf1/81)              EF            67968
D1210 (Cf1/CA74a)           EF            67969
D1210 (Cf1/35f)             AB            67970
D1210 (Cf1/279a)            EF            67971
D1210 (Cf1/C36)             CD            67972
D1210 (Cf1/13i)             AB            67973
D1210 (Cf1/C33b)            EF            67974
D1210 (Cf1/CA290a)          AB            67975
HB101 (AB24/C100 #3R)                     67976



The following derivatives of strain D1210 were deposited on 3 May 1989. 



Strain Derivative       ATCC No.

pCF1CS/C8f              67956
pCF1AB/C12f             67952
pCF1EF/14c              67949
pCF1EF/15e              67954
pCF1AB/C25c             67958
pCF1EF/C33c             67953
pCF1EF/C33f             67050
pCF1CD/33g              67951
pCF1CD/C39c             67955
pCF1EF/C40b             67957
pCF1EF/CA167b           67959



The following strains were deposited on May 12, 1989.

Strain                      ATCC No.

Lambda gt11 (C35)           40603
Lambda gt10 (beta-5a)       40602
D1210 (C40b)                67980
D1210 (M16)                 67981



The deposited materials mentioned herein are intended for convenience only, and are not required to
practice the present invention in view of the descriptions herein, and in addition these materials are
incorporated herein by reference.



Industrial Applicability 



The invention, in the various manifestations disclosed herein, has many industrial uses, some of which
are the following. The HCV cDNAs may be used for the design of probes for the detection of HCV nucleic
acids in samples. The probes derived from the cDNAs may be used to detect HCV nucleic acids in, for
example, chemical synthetic reactions. They may also be used in screening programs for anti-viral agents,
to determine the effect of the agents in inhibiting viral replication in cell culture systems and animal model
systems. The HCV polynucleotide probes are also useful in detecting viral nucleic acids in humans, and
thus, may serve as a basis for diagnosis of HCV infections in humans.

In addition to the above, the cDNAs provided herein provide information and a means for synthesizing
polypeptides containing epitopes of HCV. These polypeptides are useful in detecting antibodies to HCV
antigens. A series of immunoassays for HCV infection, based on recombinant polypeptides containing HCV
epitopes are described herein, and will find commercial use in diagnosing HCV induced NANBH, in
screening blood bank donors for HCV-caused infectious hepatitis, and also for detecting contaminated blood

agents in animal model systems. In addition, the polypeptides derived from the HCV cDNAs disclosed
herein will have utility as vaccines for treatment of HCV infections.

The polypeptides derived from the HCV cDNAs, besides the above stated uses, are also useful for
raising anti-HCV antibodies. Thus, they may be used in anti-HCV vaccines. However, the antibodies
produced as a result of immunization with the HCV polypeptides are also useful for the detection
of viral antigens in samples. Thus, they may be used to assay the production of HCV polypeptides in
chemical systems. The anti-HCV antibodies may also be used to monitor the efficacy of anti-viral agents in
screening programs where these agents are tested in tissue culture systems. They may also be used for
passive immunotherapy, and to diagnose HCV caused NANBH by allowing the detection of viral antigen(s)
in both blood donors and recipients. Another important use for anti-HCV antibodies is in affinity
chromatography for the purification of virus and viral polypeptides. The purified virus and viral polypeptide
preparations may be used in vaccines. However, the purified virus may also be useful for the development
of cell culture systems in which HCV replicates.

Antisense polynucleotides may be used as inhibitors of viral replication. 
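
At the sequence level, an antisense polynucleotide is complementary to the sense strand. The following minimal sketch, given for illustration only, computes the reverse complement of a DNA fragment; the function name `reverse_complement` and the example fragment are hypothetical and do not represent any particular antisense polynucleotide of the invention.

```python
# Translation table mapping each DNA base to its complement (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'.

    Characters other than A, C, G and T are passed through unchanged.
    """
    return dna.translate(_COMPLEMENT)[::-1]


# Hypothetical four-base fragment used only to show the call.
antisense = reverse_complement("ATGC")
```

The result is read 5' to 3', so it anneals to the input fragment in antiparallel orientation.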

For convenience, the anti-HCV antibodies and HCV polypeptides, whether natural or recombinant, may
be packaged into kits.






Claims 



1. A recombinant polynucleotide comprising a sequence derived from HCV cDNA, wherein the HCV
cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone
pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, or
wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866
in Fig. 17.

2. A recombinant polynucleotide according to claim 1, encoding an epitope of HCV.

3. A recombinant vector comprising the polynucleotide of claim 1 or claim 2.

4. A host cell transformed with the vector of claim 3.

5. A recombinant expression system comprising an open reading frame (ORF) of DNA derived from the
recombinant polynucleotide of claim 1 or claim 2, wherein the ORF is operably linked to a control sequence 
compatible with a desired host. 

6. A cell transformed with the recombinant expression system of claim 5. 

7. A polypeptide produced by the cell of claim 6.

8. A purified polypeptide comprising an epitope encoded within HCV cDNA wherein the HCV cDNA is 
of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17. 

9. An immunogenic polypeptide produced by a cell transformed with a recombinant expression vector
comprising an ORF of DNA derived from HCV cDNA, wherein the HCV cDNA is comprised of a sequence
derived from the HCV cDNA sequence in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or
clone 33c, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or
clone 39c, or clone 15e, and wherein the ORF is operably linked to a control sequence compatible with a
desired host.

10. A peptide comprising an HCV epitope, wherein the peptide is of the formula

AAx-AAy,

wherein x and y designate amino acid numbers shown in Fig. 17, and wherein the peptide is selected from
the group consisting of AA1-AA25, AA1-AA50, AA1-AA84, AA9-AA177, AA1-AA10, AA5-AA20, AA20-AA25,
AA35-AA45, AA50-AA100, AA40-AA90, AA45-AA65, AA65-AA75, AA80-90, AA99-AA120, AA95-AA110,
AA105-AA120, AA100-AA150, AA150-AA200, AA155-AA170, AA190-AA210, AA200-AA250, AA220-AA240,
AA245-AA265, AA250-AA300, AA290-AA330, AA290-305, AA300-AA350, AA310-AA330, AA350-AA400,
AA380-AA395, AA405-AA495, AA400-AA450, AA405-AA415, AA415-AA425, AA425-AA435, AA437-AA582,
AA450-AA500, AA440-AA460, AA460-AA470, AA475-AA495, AA500-AA550, AA511-AA690, AA515-AA550,
AA550-AA600, AA550-AA625, AA575-AA605, AA585-AA600, AA600-AA650, AA600-AA625, AA635-AA665,
AA650-AA700, AA645-AA680, AA700-AA750, AA700-AA725, AA700-AA750, AA725-AA775, AA770-AA790,
AA750-AA800, AA800-AA815, AA825-AA850, AA850-AA875, AA800-AA850, AA920-AA990, AA850-AA900,
AA920-AA945, AA940-AA965, AA970-AA990, AA950-AA1000, AA1000-AA1060, AA1000-AA1025,
AA1000-AA1050, AA1025-AA1040, AA1040-AA1055, AA1075-AA1175, AA1050-AA1200, AA1070-AA1100,
AA1100-AA1130, AA1140-AA1165, AA1192-AA1457, AA1195-AA1250, AA1200-AA1225, AA1225-AA1250,
AA1250-AA1300, AA1260-AA1310, AA1260-AA1280, AA1266-AA1428, AA1300-AA1350, AA1290-AA1310,
AA1310-AA1340, AA1345-AA1405, AA1345-AA1365, AA1350-AA1400, AA1365-AA1380, AA1380-AA1405,
AA1400-AA1450, AA1450-AA1500, AA1460-AA1475, AA1475-AA1515, AA1475-AA1500, AA1500-AA1550,
AA1500-AA1515, AA1515-AA1550, AA1550-AA1600, AA1545-AA1560, AA1569-AA1931, AA1570-AA1590,
AA1595-AA1610, AA1590-AA1650, AA1610-AA1645, AA1650-AA1690, AA1685-AA1770, AA1689-AA1805,
AA1690-AA1720, AA1694-AA1735, AA1720-AA1745, AA1745-AA1770, AA1750-AA1800, AA1775-AA1810,
AA1795-AA1850, AA1850-AA1900, AA1900-AA1950, AA1900-AA1920, AA1916-AA2021, AA1920-AA1940,
AA1949-AA2124, AA1950-AA2000, AA1950-AA1985, AA1980-AA2000, AA2000-AA2050, AA2005-AA2025,
AA2020-AA2045, AA2045-AA2100, AA2045-AA2070, AA2054-AA2223, AA2070-AA2100, AA2100-AA2150,
AA2150-AA2200, AA2200-AA2250, AA2200-AA2325, AA2250-AA2330, AA2255-AA2270, AA2265-AA2280,
AA2280-AA2290, AA2287-AA2385, AA2300-AA2350, AA2290-AA2310, AA2310-AA2330, AA2330-AA2350,
AA2350-AA2400, AA2348-AA2464, AA2345-AA2415, AA2345-AA2375, AA2370-AA2410, AA2371-AA2502,
AA2400-AA2450, AA2400-AA2425, AA2415-AA2450, AA2445-AA2500, AA2445-AA2475, AA2470-AA2490,
AA2500-AA2550, AA2505-AA2540, AA2535-AA2560, AA2550-AA2600, AA2560-AA2580, AA2600-AA2650,
AA2605-AA2620, AA2620-AA2650, AA2640-AA2660, AA2650-AA2700, AA2655-AA2670, AA2670-AA2700,
AA2700-AA2750, AA2740-AA2760, AA2750-AA2800, AA2755-AA2780, AA2780-AA2830, AA2785-AA2810,
AA2796-AA2886, AA2810-AA2825, AA2800-AA2850, AA2850-AA2900, AA2850-AA2865, AA2885-AA2905,
AA2900-AA2950, AA2910-AA2930, AA2925-AA2950, AA2945-end (C' terminal).

11. A polypeptide comprised of the peptide of claim 10. 

12. An immunogenic polypeptide attached to a solid substrate, wherein the polypeptide is according to
claim 7, or claim 8, or claim 9, or claim 10, or claim 11, or wherein the polypeptide is comprised of an
epitope encoded within HCV cDNA wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17.

13. A monoclonal antibody directed against an epitope encoded in HCV cDNA, wherein the HCV cDNA
is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the
sequence present in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or
clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh.

14. A preparation of purified polyclonal antibodies directed against a polypeptide comprised of an
epitope encoded within HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or
clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone
CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh.

15. A polynucleotide probe for HCV, wherein the probe is comprised of an HCV sequence derived from 
an HCV cDNA sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or from 
the complement of the HCV cDNA sequence. 

16. A kit for analyzing samples for the presence of polynucleotides from HCV comprising a poly- 
nucleotide probe containing a nucleotide sequence of about 8 or more nucleotides, wherein the nucleotide 
sequence is derived from HCV cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348 
or 8659 to 8866 in Fig. 17, wherein the polynucleotide probe is in a suitable container. 

17. A kit for analyzing samples for the presence of an HCV antigen comprising an antibody which 
reacts immunologically with an HCV antigen, wherein the antigen contains an epitope encoded within HCV 
cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or 
wherein the HCV cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone
167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or
clone 16jh.

18. A kit for analyzing samples for the presence of an HCV antibody comprising an antigenic 
polypeptide containing an HCV epitope encoded within HCV cDNA which is of a sequence indicated by 
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is in clone 13i, or clone 26j, or clone 59a, or
clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone
ag30a, or clone 205a, or clone 18g, or clone 16jh.

19. A kit for analyzing samples for the presence of an HCV antibody comprising an antigenic 
polypeptide expressed from HCV cDNA in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or
clone 33c, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or
clone 39c, or clone 15e, wherein the antigenic polypeptide is present in a suitable container. 

20. A method for detecting HCV nucleic acids in a sample comprising: 

(a) reacting nucleic acids of the sample with a polynucleotide probe for HCV, wherein the probe is
comprised of an HCV sequence derived from an HCV cDNA sequence which is of a sequence indicated by
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, and wherein the reacting is under conditions
which allow the formation of a polynucleotide duplex between the probe and the HCV nucleic acid from the
sample; and

(b) detecting a polynucleotide duplex which contains the probe, formed in step (a). 

21. An immunoassay for detecting an HCV antigen comprising: 

(a) incubating a sample suspected of containing an HCV antigen with an antibody directed against an HCV 
epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide numbers 
-319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or clone 59a, 
or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone
ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the incubating is under conditions which
allow formation of an antigen-antibody complex; and

(b) detecting an antibody-antigen complex formed in step (a) which contains the antibody.

22. An immunoassay for detecting antibodies directed against an HCV antigen comprising: 

(a) incubating a sample suspected of containing anti-HCV antibodies with an antigen polypeptide 
containing an epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by 
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or 
clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or
clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the incubating is
under conditions which allow formation of an antigen-antibody complex; and 

(b) detecting an antibody-antigen complex formed in step (a) which contains the antigen polypeptide. 

23. An immunoassay for detecting antibodies directed against an HCV antigen comprising: 

(a) incubating a sample suspected of containing anti-HCV antibodies with the polypeptide of claim 9, 
under conditions which allow formation of an antigen-antibody complex; and 

(b) detecting an antibody-antigen complex formed in step (a) which contains the antigen polypeptide. 

24. A vaccine for treatment of HCV infection comprising an immunogenic polypeptide containing an 
HCV epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide 
numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or
clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone
CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the immunogenic
polypeptide is present in a pharmacologically effective dose in a pharmaceutically acceptable excipient.

25. A method for producing antibodies to HCV comprising administering to an individual an isolated
immunogenic polypeptide containing an HCV epitope encoded in HCV cDNA, wherein the HCV cDNA is of
a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is of the sequence
present in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or clone 33c, or clone 40b, or
clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or clone 39c, or clone 15e,
and wherein the immunogenic polypeptide is present in a pharmacologically effective dose in a
pharmaceutically acceptable excipient.

26. An antisense polynucleotide derived from HCV cDNA, wherein the HCV cDNA is that shown in Fig. 17.

27. A method for preparing purified fusion polypeptide C100-3 comprising: 

(a) providing a crude cell lysate containing polypeptide C100-3,

(b) treating the crude cell lysate with an amount of acetone which causes the polypeptide to
precipitate,

(c) isolating and solubilizing the precipitated material, 

(d) isolating the C100-3 polypeptide by anion exchange chromatography, and 

(e) further isolating the C100-3 polypeptide of step (d) by gel filtration. 

28. A method for preparing an HCV polypeptide comprising: 

(a) providing a host cell transformed with a recombinant expression system comprising an open 
reading frame (ORF) of DNA derived from HCV cDNA, wherein the HCV cDNA is in clone 13i, or clone 26j, 
or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone
CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, or wherein the HCV cDNA is of a
sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, wherein the ORF is
operably linked to a control sequence compatible with a desired host; and

(b) incubating the host cell under conditions which allow expression of the HCV polypeptide.

29. A method for preparing an immunogenic HCV polypeptide comprising: 

(a) providing a host cell transformed with a recombinant expression vector comprising an ORF of 
DNA derived from HCV cDNA, wherein the HCV cDNA is comprised of a sequence derived from the HCV 
cDNA sequence in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or clone 33c, or clone 
40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or clone 39c, or clone
15e, wherein the ORF is operably linked to a control sequence compatible with the desired host; and

(b) incubating the host cell under conditions which allow expression of the HCV polypeptide.

30. A method for preparing a host cell transformed with a recombinant polynucleotide comprising a
sequence of HCV cDNA derived from the HCV cDNA in clone 13i, or clone 26j, or clone 59a, or clone 84a,
or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or
clone 205a, or clone 18g, or clone 16jh, or wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17, comprising:

(a) providing a host cell capable of transformation; 

(b) providing the recombinant polynucleotide; and 

(c) incubating (a) with (b) under conditions which allow transformation of the host cell with the 
polynucleotide. 

31. A method for preparing a recombinant polynucleotide comprised of a sequence of HCV cDNA
derived from the HCV cDNA in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or
clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g,
or clone 16jh, or wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or
8659 to 8866 in Fig. 17, comprising:

(a) providing a host cell transformed with the recombinant polynucleotide; and 

(b) isolating said polynucleotide from said host cell. 

32. A method for preparing blood free of HCV comprising: 

(a) providing a sample of blood suspected of containing HCV and anti-HCV antibodies; 

(b) providing an immunogenic polypeptide prepared according to claim 28 or 29; 

(c) incubating the sample of (a) with the immunogenic polypeptide of (b) under conditions which allow 
the formation of antibody-HCV polypeptide complexes; 

(d) detecting the complexes formed in step (c); and 

(e) saving the blood from which complexes were not detected in (d). 

33. A method for preparing blood free of HCV comprising: 

(a) providing nucleic acids from a sample of blood suspected of containing HCV polynucleotides; 

(b) providing a probe for HCV, wherein the probe is comprised of an HCV sequence derived from an
HCV cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig.
17;

(c) reacting (a) with (b) under conditions which allow the formation of a polynucleotide duplex
between the probe and the HCV nucleic acid from the sample;

(d) detecting a polynucleotide which contains the probe, formed in step (c); and

(e) saving the blood from which complexes were not detected in (d).

34. A method for producing a hybridoma which produces anti-HCV monoclonal antibodies comprising:

(a) immunizing an individual with an immunogenic polypeptide containing an epitope encoded in HCV
cDNA, wherein the HCV cDNA is HCV cDNA in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone
CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a,
or clone 18g, or clone 16jh, or wherein the HCV cDNA is of a sequence indicated by nucleotide numbers
-319 to 1348 or 8659 to 8866 in Fig. 17; or

(b) immunizing an individual with an immunogenic polypeptide prepared according to claim 29;

(c) immortalizing antibody producing cells from the immunized individual;

(d) selecting an immortal cell which produces antibodies which react with an HCV epitope in the
immunogenic polypeptide of (a) or (b); and

(e) growing said immortal cell.

Translation of DNA 12f
TGACCTGCGCCCCGCTTGCAACCCTAGACCTTCTGTCCCrgS^^ST^iS 



TAC 

A 



TGACGACTGGTCATGTGTCACCGTCCAGGAGGGCACAAGGAAGTGTTG^GGATGGTC^ 

acaggtggccggagtaggtggaggtggtcttgtaIc^ctcc^SSt^ 



accccagttcgtagcgcaggacccggiaattcaccctcatgcagcaagaggacaagg^g 
acgaacgtctgcgcgcgcagacgaggacgaac^ctactacgatgag^atagg^ 



"~77~77~7 7 Overlap with 14 i 



val 
TTGTATC 
AACATAG

Translation of DNA K9-1

GiyCysProGluAxgLeuAlaSerCysArgProLeuXhrAspPheAspGlnGlyTrpGly 
1 CAGGCXGTCCTGACAGGCTAGCCAGCXGCCGACCCCXXACCGAXXXTGACCAGCGCTGGG 
GXCCCACAGGACTCTCCGATCGGXCGACGCCXGGGGAATGGCXAAAACTGGTCCCGACCC 

ProIleSerXyrAlaAanGlySerGlyProAspGlnArgProTyrCysTrpHisXyrPro 
61 GCCCTATCAGTTATGCCAACGGAAGCGGCCCCGACCAGCGCCCCTACTGCTGGCACTACC 
CGGGAXAGXCAAXACGGXXGCCXXCGCCGGGGCTGGXCGCGGGGAXGACGACCCXGAXGG 

ProLytProCysGlyrieValProMaLysSerValCysGlyProValXyrCysPheXhr 
121 CCCCAAAACCXXGCGGTAXXGXGCCCGCGAAGAGXGXGXGXGGXCCGGXAXAXTGCTXCA 
GGGGXXXXGGAACGCGAXAACACGGGCGCXXCXCACACACACCAGGCCAXAXAACGAAGX 

ProSerProVaiValvalGlyXhrXhrA*pAxgSerGlyAlaProXhrXyrSerXrpGly 
181 CTCCCAGCCCCGTGGTGGTGGGAACGACCG ACAGGTCGGGCGCGCCCACCTACAGCTGGG 
GAGGGXCGGGGCACCACCACCCTXGCXGGCXGXCCAGCCCGCGCGGGXGGAXGTCGACCC 

GluAsnAspXhrAspValPhevalLeuAsnAsnXhrArgProProLeuGlyAinXrpPhe 
241 GXGAAAAXGAXACGGACGXCXXCGXCCTXAACAAXACCAGGCCACCGCXGGGGAAXXGGX 
CACTTTTACTATGCCTGCAGAAGCAGGAATTGTTATGGTCCGGTGGCGACCCGTTAACCA 

GlyCysThrTrpMetAsnSerThrGiyPheThrLysV^lCysGlyAlaProProCysVal 
301 TCGGTTGXACCTGG ATGAACTCAACTGGAXTCACCAAAGTGTGCGGAGCGCCTCCTTGTG 
AGCCAACATGGACCTACTTCAGTTGACCTAAGTGGrrrCACACGCCTCGCGGAGGAACAC 

HeGlyGlyAlaGlyA$nA5nThrLeuHisCysProThxAspCysPh«ArgLy$HisPro 

3 6 1 TCAXCGGAGGGGCGGGCAACAACACCCTGCACTGCCCCACTGAXTGCTTCCGCAAGCATC 

AGXAGCCXCCCCGCCCGXXGXXGXGGGACGXGACGGGGXGACXAACGAAGGCGTXCGXAG 

AspAlaXhrXyrSerArgCysGlySerGlyProXrpXleXhrProArgCysLeuValAsp 

4 2 1 CGG ACGCCACATACTCTCGGTGCGGCTCCGGTCCCTGC ATCACACCCAGGTGCCTGGTCG 

GCCXGCGGXGXAXGAGAGCCACGCCGAGGCCAGGGACCXAGXGXGGGXCCACGGACCAGC 



XyrProXyrArgLeuXrpHisXyrProCysXhrlleAinXyrXhrllePhelyslleArg 

4 81 ACXACCCGXAXAGGCXXXGGCAXXAXCCXXGXACCAXCAACXACACXAXAXXXAAAAXCA 
- - XGAXGGGCAXAXCCGAAACCGXAAXAGGAACAXGGXAGXXGAXGXGAXAXAAATXXXAGX 



M«tXyTValGlyGlyValGIuHijArgLeuGluAlaAlaCysA«nXrpXhrArgGlyGla 
541 GGAXGXACGXGGGAGGGGXCGAGCACAGGCXGGAAGCXGCCXGCAACXGGACGCGGGGCG 
CCXACAXGCACCCXCCCCAGCXCGXGXCCGACCXXCGACCGACGXXGACCXGCGCCCCGC 



ArgCysAspL«uGluA3pArgAspArgS«rGluLeuSerProL«uL€uLeuXhrXhxXhr 

601 AACGXXGCGAXCXGGAAGAXAGGGACAGGXCCGAGCXCAGCCCGXXACXGCXGACCACXA 
XXGCAACGCXAGACCXXCXAXCCCXGXCCAGGCXCGAGXCGGGCAAXGACGACTGGTGAX 



GlnXrpGlnValL«uProCysS«rPh*XhrXhxt*uProAiaL«uS«rXhrGlyLeuIle 
661 CACAGXGGCAGGXCCXCCCGXGXXCCXXCACAACCCXGCCAGCCXXGXCCACCGGCCXCA 
GXGXCACCGXCCAGGAGGGCACAAGGAAGXGXTGGGACGGXCGGAACAGGXGGCCGGAGX 

—Overlap with Combined ORF o£ DNAj 12f through 15e~ 

HlsLeuHi$GlnA*nIleValA*pValGlnXyrI-euXyrt31yValGlySerSer:ieAIa 
721 TCCACCXCCACCAGAACAXXGXGGACGXGCAGXACXXGXACGGGGXGGGGXCAAGCATCG 
AGGXGGAGGXGGXCXXGXAACACCXGCACGXCAXGAACAXGCCCCACCCCAGXXCGXAGC 



SerXrpAlalleLysXrpGluXyrValVall^uLeuPheLeuLeuLeuAlaAspAiaArg 
7 S 1 CGTCCXGGGCCAXXAAGXGGGAGXACGXCGXCCTCCXGXXCCXXCXGCXXGCAGACC CCC 
GCAGGACCCGGXAAXXCACCCXCAXOCAGCAGGAGGACAAGGAAGACGAACGXCXGCGCG
CxCACACGACGACCAACACCTACTACGATGAGTXTAaGGTTCCCCTTCCCCGAAACCTCT



501 ^ili^^"s^^s^ss^^ils^^^; ,1 



ACA*SAASACI^CGIACe*I»aACTTeCCAITaiCCeKS^TC8re«»TSlS» 



10,1 t=&^?l?§II5l^!^^SI^ISS^^5lS^ 

AccTGTsecieiMcowccttGCKKeseJi^ 



1141 g^^^ssoss^isassi^Egs^sgs&sggasss-* 

CASACACTCSIATAATOTTCGCSATATAOTCSACOttS^ 

ACTGGTCTCACCTTCGCGTTGACGTGCACACCTAAGGGGGGGAGTTGCAGGCTCCCCCCG 

-261 GCGACGCTGTCATCTTACTCATGTGTGCTGTACACCCGACTCTGGTATTTGACATCACCA 
CGCTGCGACAGTAGAATGAGTACACACGACATGTGGGCTGAGACCATAAACTGTAGTGGT 



, »^51fi£i , i«uAl» v '*lPheGlyProLeuTrpIleL«uClnAl* 
13 21 AAaTGCTGCTGGCCCTCTTCGGACCCCTTTGGATTCTTCAAGCCAG 
TTAACGACGACCGGCAGAAGCCTGGGGAAACCTAAGAAGTTCGGTC

Translation of DNA 15e



^^^^^^^^^ 
^^^^^^^^^ 




ProLeuAspLeuProProIlelleGlnArgLeu 
ACCACTTGATCTACCTCCAATCATTCAAACACTC 
TGGTGAACTAGATGGAGGTTAGTAAGTTTCTGAG

Translation of DNA 13i

Translation of DNA 26j

LeuPheTyrHisHisLysPhcAsnSerSerGlyCysProGluArgLeuAlaSerCysArg 
1 GCTTTTCTATCACCACAAGTTCAACTCTTCAGGCTGTCCTGAGAGGCTAGCCAGCTGCCG 
CGAAAAGATAGTGGTGTTCAAGTTGAGAAGTCCGACAGGACTCTCCGATCGGTCGACGGC 

ProLeuThrAspPheAspGlnGlyTrpGlyProIleSerTyrAlaAanGlySerGlyPr© 
61 ACCCCTTACCGATTTTGACCAGGGCTGGGGCCCTATCAGTTATGCCAACGGAAGCGGCCC 
TGGGGAATGGCTAAAACTGGTCCCGACCCCGGGATAGTCAATACGGTTGCCTTCGCCGGG 

AspGlnArgProTyrCysTrpHisTyrProProLysProCysGlylleValProAlaLys 
121 CGACCAGCGCCCCTACTGCTGGCACTACCCCCCAAAACCTTGCGGTATTGTGCCCGCGAA 
GCTGGTCGCGGGGATGACGACCGTGATGGGGGGTTTTGGAACGCCATAACACGGGCGCTT 

Overlap with 131- — 

SerValCysGlyProValTyrCysPheThrProSerProValValVal 
181 GAGTGTGTGTGGTCCGGTATATTGCTTCACTCCCAGCCCCGTGGTGGTGGG 
CTCACACACACCAGGCCATATAACGAAGTGAGGGTCGGGGCACCACCACCC

Translation of DNA CA59a

AACCAITACCSAOTCSACSACKeCIASOGIOTICKTAGA^WACTMrelecIcSA 
^^^^^^CCCGTATCSCATAAAGAGGTAC^CCKTTCACKSCTOCCAC 
GACCATCACGACSACCATAAACGGCC^CTGCGCCTCTGGS^^ 

CGGCCGGTGTGACACAGACCTAAACAATCGGAGGACCGTGGTCCGCGGTTCGTCTTGCAG 

gtcgactagttgtggttgccgtcaaccgtggagttatcgtgccgggac^aIgttaIS 



TCGGAGTTGTGGCCGACCAACCGTCCCGAAAAGATAG^ 

— Overlap with 26j 

Overlap with K9-1 

SZ£f£2S luAr 9LeuAlaSerCysArgPro 

TGTCCTGAGAGGCTAGCCAGCTGCCGACCCC 

ACAQ3ACTCTGCGATCGGTCGACGGCTGGGG

Translation of DNA CA84a



GCGTTCCAACGTTAACGAGATAGATAGGGCCgSa^ 



TATACTACTACTTGACCAGGGGATGCTGCCGCAACCATTACCGAGTCGACGAG^GCCTA^ 



ctgttcggtagaacctgtactagcgaccacgagtgawct^^ 



Overlap with CA59a 



AspAlaGluThrHiaValThrGly 
TCGACGCGGAAACCCACGTCACCGGGG 
AGCTGCGCCTTTGGGTGCAGTGGCCCC

Translation of DNA CA156e

maaBBammsBsm 



M«tAl^-«~z~r~r.":?y? rl *p cA84a- 



LeuArglleProGlnAla 
GCTCCGGATCCCACAAGCC 
CGAGG.CC.TAGGGTGTTCGG

Translation of DNA CA167b

SerThrGlyLeuTyrHisValThrAsnAspCysProAsnSerSerlleValTyrGluAla 
CTCCACGGGGCTTTACCACGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGC 
GAGGTGCCCCGAAATGGTGCAGTGGTTACTAACGGGATTGAGCTCATAACACATGCTCCG 

AlaAspAlalleLeuHisThrProGlyCysValProCysValArgGluGlyAsnAlaSer 
GGCCGATGCCATCCTGCACACTCCGGGGTGCGTCCCTTGCGTTCGTGAGGGCAACGCCTC 
CCGGCTACGGTAGGACGTGTGAGGCCCCACGCAGGGAACGCAAGCACTCCCGTTGCGGAG 



ArgCysTrpValAlaMetThrProThrValAlaThrArgAspGlyLysLeuProAlaThr 
GAGGTGTTGGGTGGCGATGACCCCTACGGTGGCCACCAGGGATGGCAAACTCCCCGCGAC 
CTCCACAACCCACCGCTACTGGGGATGCCACCGGTGGTCCCTACCGTTTGAGGGGCGCTG 

Overlap with CAlS6e 

GlnLeuArgArgHisIleAspLeuLeuValGlySerAlaThrLeuCysSerAlaLeuTyr 
GCAGCTTCGACGTCACATCGATCTGCTTGTCGGGAGCGCTACCCTCTGTTCGGCCCTCTA 
CGTCGAAGCTGCAGTGTAGCTAGACGAACAGCCCTCGCGATGGGAGACAAGCCGGGAGAT 

ValGlyAspLeuCysGlySerValPhcLeu 
CGTGGGGGACTTGTGCGGGTCTGTCTTTCTTG 
GCACCCCCTGAACACGCCCAGACAGAAAGAAC

Translation of DNA CA216a




TklCTXTGGaAAXClCCCCQX 



3«1

Translation of DNA CA290a
LystyM^taYiArsAttThrAmnAr^ArfP^ 

Glnilev^GlTGlr^tiTyri^ui^uFroAi^ 

tl CTCAGArCGTTWTCCACTTTACTTGTTGCCGOXAGCGGM 
CACTCTACXXKCCACCTCAJ^TCAAOU^ 

ThrAj^y»fhrSerSluArs3erSlnrraAr^lTArgAr«lnrr^ 
121 CgAreaflfow*CTTCCSASCS&TOXJU^ 
OCTGCTCrXTCTtMAOOCrCKC^ 

ArgAwProGluaiyA«Thr?rpAl«ain*rtaai? - iy^ 

181 CTCGTCTCCCCGAGGGCAGdACCTGfcCTCAGCC^ 
C**^COCCCCTCCCCTCCTCa^CSAfl*CC^ 

CluCl7Cy»Gl7?r?JLlA£l7?rpI*u£*uS*rPrcJLr^ 
241 ATGAGGGCWC^TM«^^ 

TACTCCCSAOttCCACCQgCCTACCSACCACACACCCC^ 



P*^hrA«p*r©Ax*Ar7ArsS«ArsA*nIjeuSlyt,Y 
3 0 1 GCCCCACAfiACCCCCGGOGTJU^^GCGCAATTTGGCTA 

CMMWTCTGOGGGCCGCATCCAflCtWCTTAAACCGATTCCACTA^CTATCCCAATCCA 



GlYfnaAltAJpL^UKttSlrrf*!!*'? 1 ^ 

3 6 1 C CfflK gTC TCC CA C CTCATCSGCTlCATJ^C 



— Or«xl*p with CASK*—— 



Ar?Alal*uAimfGljT'alAmtlI*uGluA«p(31 

4 21 CCA<^CCCTOXSCATCCCSTCCCCSTTCT<^^ 

GCTCCCMGACC^TACa^QGC C CAAgACCrtCTOC^^ 



4 fi 1 ACCTTCCTOOTTGCTCTTTCTCTACCTTC 
TG(£AACC*AGCAACGAGAAAfiAGATGGAAfi 



FIGURE 11 



EP 0 388 232 A1 
Translation of DNA ag30a

IMe-tSerValValClaProProClyProProl.au 

FroGlyOluProJW 

GCCCAGCAAAGAACCTAGTTCM^ 
Cy?AM OP AM GI7AI.C7S 

181 CTGCTAGCCGAGTAGTCTTG<WTC<^GAAAGGCCTTCTGGTACTGCCTGATAGOCTGCTT 
GACGATCGGCTCATCACAACCCAGCGCTTTCCM 

GluCya?rcClyAr?SerAr*Arg?roCy*T^ 

241 GCGAGTGCCCCCGGAGGTCTCGTAGACCGTGCACCATGAGCACG^ 
CCCTCACGGGGCCCTCCAGJUttATCra 

Ly^anLy^rgAanThrAanAr^Ar^froGlnAapVAlL/aPhePrcGlytilTGlyGln 



301 AAAAAAACAAACGTAACACCAACCGTCGCCCACAGGACGTCAAGTTCCCGGGTGGOGGIC 
TTTTTTTCTTTCCATTCTGGTTGCCAGCGGCTSTCC7CCA 

Il«v«IGlyClyValTyrI^uI*uPr&Kr?ArsG^^ 



3 61 AGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCCAGATTGGGTGTGCGCGCGA 

TCT A CCAA C CAC C TCAAATCAACAACCCCCCCTCCCCCGGATCTAACCCACACCCCCgCT 

Ar^LyaThrScrGluAr^SerGlnProArtrGlyAr^ArgOlnProIleProLyaAl^Arg 

4 21 CGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCTATCCCCAAGGCTC 

GCTCTTTCTGAAGGCTCGCCACCGTTX^ACCTCCATCTGCACTCC^ATAflacCTTCCCAC 

ArgProGluGlyArgThrTrpAl«GlnProGlyCyr?roTrpProL«uTyrGlyA«nGlu 

Overlap with CA290a —

481 GTCRGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATG 
•rAGCCCGCCTCCCCTCCTCGACCCCAGTCCGCCCCATGGGAJUrCGGCCACATACCC'rrAC 
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541 >09t?CT0C3G0T^SC0TCATGGCICCTC!TCTCCCCGTGGCTCTCCGCCTACCTCGaCCC 
TfcrA«p£roJ^Arg>ueg3erAi^M^nIieuGlyT^^ 



601 CCACAGACCCCCGGCGTAGGICGCGCAATTTGGGTAA^CTCXTCCATXCCCTTACCTGCG 
GGTGTCWCGGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTATGGGAXTGCACGC 

Phe 



661 GCTTC 
CGAAG 



» - Start of long HCV ORF

| - Putative first amino acid of large HCV polyprotein
» - Putative small encoded peptides (that may play a
translational regulatory role)
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Translation of DNA CA205a



vaXL#ttSlyArvClu te yy r oC r jSlyThrAlaOP AM GlyJUaCyafiluCyaPrefily 



121 AACACCAACCGTCGCCCJUy^ 



vai XTrl^uI^PraArfArffGlYPrQArft^utSlTVa 1 Ar^AiaThrArsiyiTtaSer 
LSI CTTTA C T ra TT G CCSCGa>^^ 

CAAATGAACAAC(X3CQC0TCCCCO90ArCTAACCCAC^^ lCITIlIQ aAOQ 



GiuArgserrsinPrc^goiTiu^ArTSinProiiapraLfkJU^ 

241 CJ^aST CC aUl C CT M lc C ^AOCTCACC C YATCC 

cTCGcauj cQ rioa^scTCCATCTty^ 



MgrhrrrpAlaGlnFroGlyTyrProTrpProLauTTTOlyMnGluGlycyi 

301 ^^f^^^r^^iy^^^^rn 



* - putative initiator methionine codon




FIGURE 13
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*ProPr©OP 

♦SerpirMetAsnHi55erProVaiArgAsnTyrCysLeuHisAlaCluS«rVaUW 
*I*^iaK±aGlu£erIjeuPr©Cy^luGluI^uI>«uS«rSer^ 
1 CTCCACCAIGAATCACTCCOTGTCAG^^ 
CAOTrCCTAC^ACTCAGGGCACACXCTT^ 



f^tSorVaLValcinPrcProClyProProLauProGlyCluProAM 
MetAlaLeuValCP 

€1 ATGGCGTTAGTATG>£TGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCATAGT 
TACCGCAATCATACTCACAGCACGTCSGAGGTCC^ 



121 GGTCTGCGGAACCGGTG AG TACACCG 3 AATTGCCAGG A CG ACCGGGTCCTTTCTTGGATC 
CCAGACGCCTTGGCCACXCAT3tGGCCTTAACGGTCCTGCTGGCCCAGGAAAGAACCTAG 



Overlap with ag30a —

iMetP rcGl yA*pLeufil yva IProProG laAs pCysAK 

181 AACCC3CTCAATGCC73GAGATTTGGGCGTGCCCXCGCAAGACTOCTAGCCGAGTAGTCT 
TTCCGCGJ^rrACCGArCTCTAAACCCCC^CGGCGC^TTCTGACGATCGCCTCATCA^ 



OP AM GlyAiaCyaGluCyaProGlyArgS*r 

241 TGGGTCGCGAAAGGCCTTGTGGTAClGCCTGArAGGGTGCTTGCGAGTGCCCCGGGAGGT 
ACCCAGCGCTTTCCGGAACACCATGACGGACTAXCCCACGAACGCTCACflGGGCCCTCCA 



ArgArg 



301 CTCGTAGA 
CACCATCT 



♦ - Start of long HCV ORF

* - Putative small encoded peptides (that may play
a translational regulatory role).
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Translation of DNA 16jh
Overlap with 15e



ccccggacgatgaggtatcttggtgacctagatggaggtta^^ 



61 ^ 

GAGTCGCGTAAAAGTGAGGTGTCAATGAGAGGTCCACTTTAATTATCCCAC^C^A^ 

Gly* 

GAGTCTTTTGAACCCCATGGCGGGAACGCTCGAACCTCTGTGGCCCGGGWTCG^GGCG 

181 skssssss^^ 

cgatccgaagaccggtctcctccgtcccgacggtatacaccg^^ 

Al a ValArgThrLysLeuLy s 
241 GCAGTAAGAACAAAGCTCAAAC 

cgtcattcttgtttcgagtttg 

* - nucleotide heterogeneity 
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COMBINED ORF OF DNAs pi14a THROUGH 15e

(pi14a/CA167b/CA156e/CA84a/CA59a/K9-1/12f/14i/11b/7f/7e/

8h/33c/40b/37b/35/36/81/32/33b/25c/14c/8f/33f/33g/39c/
35f/19g/26g & 15e)

6 i SSESSSS^^ 

CCCATGTATGGCGAGCAGCCGCGGGGAGAACCTCCGCGACGGTC^GGGACCGCSTACCG 

ValArgValLeuGluAspGlyValAsnTyrAlaThrGlyAsnLeuProGlyCysSerPhe
121 GTCCGGGTTCTGGAAGACGGCGTGAACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTC
CAGGCCCAAGACCTTCTGCCGCACTTGATACGTTGTCCCTTGGAAGGACCAACGAGAAAG

AGAXAGAAGGAAGACCGGGACGAGAGAACGAACXGACACGGGCGAAGCCGGAXGGXXCAC 

241 £GC^TC^GG<^CXXXACCACGXC*CC^ 

GCGTTGAGGTGCCCCGAAATGGTGCAGTGGTTACTAACGGGATTGAGCTCATAACACATG 

GluAlaAlaAspAlaIleLeuHisThrProGlyCysValProCysValArgGluGlyAsn
301 GAGGCGGCCGATGCCATCCTGCACACTCCGGGGTGCGTCCCTTGCGTTCGTGAGGGCAAC
CTCCGCCGGCTACGGTAGGACGTGTGAGGCCCCACGCAGGGAACGCAAGCACTCCCGTTG

361 GCCTCGAGGTGTTGGGTGGCGATGACCCCTACGGTGGCCACCAGGGATGGCAAACTCCCC

CGGAGCTCCACAACCCACCGCTACTGGGGATGCCACCGGTGGTCCCTACCGTTTGAGGGG

AlaThrGlnLeuArgArgHisIleAspLeuLeuValGlySerAlaThrLeuCysSerAla
421 GCGACGCAGCTTCGACGTCACATCGATCTGCTTGTCGGGAGCGCCACCCTCTGTTCGGCC
CGCTGCGTCGAAGCTGCAGTGTAGCTAGACGAACAGCCCTCGCGGTGGGAGACAAGCCGG

481 CTCTACGTGGGGGACCTATGCGGGTCTGTCTTTCTTGTCGGCCAACTGTTCACCTTCTCT

GAGATGCACCCCCTGGATACGCCCAGACAGAAAGAACAGCCGGTTGACAAGTGGAAGAGA

ProArgArgHisTrpThrThrGlnGlyCysAsnCysSerIleTyrProGlyHisIleThr
541 CCCAGGCGCCACTGGACGACGCAAGGTTGCAATTGCTCTATCTATCCCGGCCATATAACG
GGGTCCGCGGTGACCTGCTGCGTTCCAACGTTAACGAGATAGATAGGGCCGGTATATTGC

GlyHisArgMetAlaTrpAspMetMetMetAsnTrpSerProThrThrAlaLeuValMet
601 GGTCACCGCATGGCATGGGATATGATGATGAACTGGTCCCCTACGACGGCGTTGGTAATG
CCAGTGGCGTACCGTACCCTATACTACTACTTGACCAGGGGATGCTGCCGCAACCATTAC

AlaGlnLeuLeuArgIleProGlnAlaIleLeuAspMetIleAlaGlyAlaHisTrpGly
661 GCTCAGCTGCTCCGGATCCCACAAGCCATCTTGGACATGATCGCTGGTGCTCACTGGGGA
CGAGTCGACGAGGCCTAGGGTGTTCGGTAGAACCTGTACTAGCGACCACGAGTGACCCCT

ValLeuAlaGlyIleAlaTyrPheSerMetValGlyAsnTrpAlaLysValLeuValVal
721 GTCCTGGCGGGCATAGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTG
CAGGACCGCCCGTATCGCATAAAGAGGTACCACCCCTTGACCCGCTTCCAGGACCATCAC

LeuLeuLeuPheAlaGlyValAspAlaGluThrHisValThrGlyGlySerAlaGlyHis
781 CTGCTGCTATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCAC
GACGACGATAAACGGCCGCAGCTGCGCCTTTGGGTGCAGTGGCCCCCTTCACGGCCGGTG

r 
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841 



901 



961 



1021 



ACTGTGTCrGGATTOTTAGCCTCCTCGC^ 

«CTGGTTGCCGTCAACCGT«^ 

TGGCCGACCAACCGXCCCGAAAA^ 

TCCCATC<KTCGACC<KT^ 



1081 



1141 



1201 



L261 



1321 



1381 



1441 




CGGTTGCCTTCGCCGGGGCTGGTCGCGGGGATGACGACCGTGAXGGGGGGTTTTGGA^G 

GlylleValProAlaLysSarvalCysGlyProValTyrCysPheThrProSerProVdl 
G "£" G 7 G SS CGCGAA ^^ 

CCATAACACGGGCGCXTCTCACACACACCAGGCCATATAACGAAGTGAGGGTCGGGGCAC 
GXGGXGGGAACGACCGACAGGXCGGGCGCGCCCACCXA^ 

CACCACCCTTGCTGGCTGXCCAGCCCGCGCGGGTGGATGXCGACCCCACXTTTACTAXGC 

AapValPhcValL^uAjriAanXhrAxgProProLeuGlyAsnXrpPheGlyCysThrC-o 
GACG7CXTCGXCCXXAACAAXACCAGGCCACCGCTGGGCAATTGGXXCGGXTGTACCTGG 
CTGCAGAAGCAGGAATTGXXATGGXCCGGTGGCGACCCGXXAACCAAGCCAACAXGGACC 

TACXTGAGXTGACCXAAGXGGXXXCACACGCCXCGCGGAGGAACACAGXAGCCXCCCCGC 
GlyAsnAsnThrLeuHisCysProThrAspCysPheArgLysHisProAapAlaXhrTyr 
CCGXXGXXGXGGGACGXGACGGGGXGACXAACGAAGGCGXTCGXAGGCCXGCGGXGXAXG 




1501 



1561 



AGAGCCACGCCGAGGCCAGGGACCXAGXGXGGGXCCACGGACCAGCXGAXGGGCATATCC 

LeuTrpHiaTyrProCysXhrlleAsnTyrThxIlePhfttysIleArgMetTyrValGly 
CXTXGGCAXXAXCCXXGXACCAXCAACTACACCATAXXXAAAAXCAGGATGXACGXGGGA 
GAAACCGXAAXAGGAACAXGGXAGXXGATGXGGXAXAAAXTTTAGXCCXACATGCACCCX 

GlyValGluHlsArgLeuGluAlaAlaCysAsnXrpXhrAxgGlyGluArgCysAspLeu 
GGGGXCGAACACAGGCXGGAAGCXGCCXGCAACTGGACGCGGGGCGAACGXTGCGAXCTG 
CCCCAGCXXGXGXCCGACCTXCGACGGACGXTGACCXGCGCCCCGCXTGCAACGCXAGAC 

GluAspAxgAspArgSerGluLauSerProI^uI^uI^uXhrThrXhrGlnXrpGlnVal 
1621 GAAGACAGGGACAGGXCCGAGCXCAGCCCGXXACXGCXGACCACXACACAGTGGCAGGXC 
CXXCXGXCCCXGXCCAGGCXCGAGXCGGGCAAXGACGACXGGTGAXGXGXCACCGXCCAG 

LcuProCysSerPhcThrThrLeuProAlaLeuSerThrGlyLauIleHisLeuHiaGIn 
1631 CXCCCGTGXXCCXTCACAACCCXACCAGCCXTGXCCACCGGCCXCAXCCACCXCCACCAG 
GAGGGCACAAGGAAGXGXXGGGAXGGXCGGAACAGGXGGCCGGAGXAGGTGGAGGXGGXC 

AsnlleValAapValGlnTyrLeuTyrGlyValGlySerSerllftAlaSerTrp Alalia 
AACAXXGXGGACGXGCAGXACXTGXACGGGGXGGGGXCAAGCAXCGCGXCC7GGGCCAXT 
TTGXAACACCXGCACGXCATGAACAXGCCCCACCCCAGXTCGXAGCGCAGGACCCGGXAA 

LysTrpGluTyrValVallAuLeuPhel^uLeutauAlaAapAlaArgValCysSerCys 
AAGXGGGAGXACGXCGXXCXCCXGXTCCXXCXGCXTGCAGACGCGCGCGXCTGCTCCTGC 



1741 



1301 
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TTCACCCTCATGCAGCAAGAGGACAAGGAAGACGAACGTCTGCGCGCGCAGACGAGGACG 
AACACCTACTACGATGAGTATAGGGTTCGC^ 

imi "Kca£ca??c^ 

"ACGTCGTAG^ACCGGCCCTGCGTGCCAG^ 

cgtaccataaacttcccattcacccacgggcctcgccagatgtgg^aagatcck 

20<1 SgjS^ 

G«GAG<»GGACGAGGACAACaa^^ 

2ioi JcS^g^^^ 

cggcgcagcacaccgccacaacaagag^gcccaa^ 

2161 3@E8a^i3BgSSSE^ 

atgttcgcgatatagtcgaccacgaacacca^gaagtcatajSagactgct 

222i ££iSS§K^iK^ 

cgcgttgacgtgcacacctaagggggggagttccaggctc^cc^cgcgctc 

,,«,, ^^^^S^* ValHisPro, ThxLeuValPheAspIleThrLysL«ui«uL«uAla 

2281 SSSgSSESESSSg^ 

2341 iSiliiS^^ 

. C^^GCCTGGGGAAACCTAAGAAGTTCGGTCAAACGAAmCATGGGATGAAACACGCG 

2401 ^^^^^S^^SSSSlSgfflSSi 

CAGGTTCCGGAAGAGGCCAAGACGCGCAATCGCGCCTTCTACTAGCCTCCGGTAATGCAC 

GTTTACCAGTAGTAATTCAATCCCCGCGAATGACCGTGGATACAAATATTCGTAG^ 
^S^ii^^AspTrpAlaHisAsnGlyLeuAr^AsoLeuAlaValAlaValGluProval 

2581 iEES?? 8 ^^ 

CAGAAGAGGGTTTACCTCTGGTTCGAGTAGTGCACCCCCCGTCTATGGCGGCGCACGCCA 

CTGTAGTAGTTGCCGAACGGACAAAGGCGGGCGTCCCCGGCCCTCTATGACGAGCCCGGT 

„„, ^i^P^^yWetValSerLysGlyTrpArgLeuLeuAlaProIleThrAlaTyrAlaGln 
2701 gccgatggaatggtcxcc»ag<3M^ 

CGGCTACCTTACCAGAGGTTCCCCACCTCCAACGACCGCGGGTAGTGCCGCAIGCGGGTC 
^^Jifff^lyLeuLeuGlyCysIlelleThrSerLeuThrGlyArgAspLysAjnGln 
GTCTGTTCCCCGGAGGATCCCACGTATTAGTGGTCGGATTGACCGGCCCTGtTXTTGGTT 
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CCA<*ACAGTAGGTCTACATATGGTTAcI^ 

3ooi c^sssgg^^ 





34.1 gJggSSSS^^ 
™e«Kee«TeAMG»A^^ 

3601 iSiilii#^ s ^^««"«^^™^^™c 

ACACTGCTCACGG TG AGG TGCC TACGGTG T AG<STAGAACCCGTAGCCG TGACAGGAACTG 

GTTCGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGG 

1 STS^^^CCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCT 
CAGTGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGA 

->-,<., £ilf2y^ 1 y L y sAla Ii e P-o^«uGluValIleZ.ysGlyGlyArcHiJl^uIlaPh«Cy» 
3781 TTTTACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGT 

FIGURE 16-4 






"AA^CCITCCMTACO^AGCIT^TTMITCCCCCCCIC™!^^^ 
4021 ^?2?5*S2!T*i?5*?l»ThrValA 




4081 A^S£c!c?Sg£&^ 



4141 



Lys 




4201 



4441 GCT^Iir?S^r^° ProProSerTr P AspGln« a 






CTGGTCTACACCTTCACAAACTAAGCCGAG 




:acgcaccagtatcacccgtcccagcagaa<^cgcccttcggccgttag 

4741 ItXSg^IIa^^ 
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- ^^^^^^^^^ 





54.1 ^^S^S^SSff^S^SSSS^SS^ 

CACa:CCXCTC5CTACSTCGAC»OT0CA0I0ACSGIAraAaTCm^aTcI«TOa 

3521 ^^^^sssss&issss^s^ssssis^i 

GTCGAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCA 
|"j£Pjrf ^2 As P I1 eTrpAspTrpIleCysGluValL«uSerAapPhcl.7sThrTrp 

GATTTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCC 

5701 TATAAGGGGGTCTGGCGAGTGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAG 
ATATTCCCCCAGACCGCTCACCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTC 

,, e , Jif?^ lyHisValLysAsnG1 y T ^ e ^? rleVa ^lyPrcArgThrCysAr?Asn 
5/61 ATCACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAAC 
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5941 



6001 



6121 



61S1 



6361 



6421 



6481 



TAGrGACCTGTACAGTTTTTGCCCTGCrACTCCTAGCAGCCAGGATCCtGGACGXCCrTG 

MetTrpSerGlyThrPheProIlcAsnAlaTyrThrThrGlyProCysThrProLeuPro 
5821 ATGTGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCT 
XACACCXCACCCTGGAAGGGGXAAXXACGGAXGXGGXGCCCGGGGACAXGGGGGGAAGGA 

AlaProAsnTyrThrPheAlaLeuXrpArgValSerAlaGluGluXyrValGluIleAra 
5881 GCGCCGAACXACACGXXCGCGCXAXGGAGGGXGXCTGCAGAGGAAXAXGXGGAG AXAAGG 
CGCGGCXXGAXGXGCAAGCGCGAXACCXCCCACAGACGXCXCCXXAXACACCXCXAXXCC 

GlnValGlyAspPheHisXyrValXhrGlyMetXhrThrAspAsnLaut-ysCysProCva 
CAGGXGGGGGACTXCCACXACGXGACGGGXAXGACTACXGACAAXCXCAAAXGCCCGXGC 
GXCCACCCCCXGAAGGXGAXGCACXGCCCAXACXGAXGACXGXTAGAGrrXACGGGCACG 

GlnValProSerPrcCluPhePheXhrGluteuAspGlyValArgLeuHisArgPheAla 
^STSS^?SS2S CG ^ TTTT ^^ G ^ TTCGACGG ^ TCCG ^CXACAXAGGXXXGCG 
GXCCAGGGXAGCGGGCXXAAAAAGXGXCXXAACCXGCCCCACGCGGAXGXAXCCAAACGC 

ProProCysLysProLeuLauArgGluGluValSerPheArgValGlytauHisGluXyr 
6061 CCCCCCXGCAAGCCCXXGCXGCGGGAGGAGGXAXCAXXCAGAGXAGGACXCCACGAAXAC 
GGGGGGACGXXCGGGAACGACGCCCXCCXCCAXAGXAAGXCXCAXCCXGAGGXGCXXAXG 

CCGGXAGGGXCGCAAXXACCTXGCGAGCCCGAACCGGACGXGGCCGTGXTGACGXCCAXG 
GGCCAXCCCAGCGXXAAXGGAACGCXCGGGCXXGGCCXGCACCGGCACAACXGCAGGXAC 

LauXhxAspProSerHiJileXhrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySer 
CXCACXGAXCCCXCCCAXAXAACAGCAGAGGCGGCCGGGCGAAGGXXGGCGAGGGGAXCA 
GAGXGACXAGGGAGGGXAXAXXGXCGXCXCCGCCGGCCCGCXXCCAACCGCXCCCCXAGX 

ProProSerValAIaSarSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaXhr 
6 241 CCCCCCTCTGXGGCCAGCXCCXCGGCXAGCCAGCXAXCCGCXCCAXCXCXCAAGGCAACX 
GGGGGGAGACACCGGXCGAGGAGCCGAXCGGXCGAXAGGCGAGGXAGAGAGXXCCGXXGA 

^5 T ^AlaAsnHisAspSarProAspAlaGluLauIleGluAlaAsnLauI.euXrpArg 
6 301 XGCACCGCXAACCAXGACXCCCCXGAXGCXGAGCXCAXAGAGGCCAACCXCCXAXGGAGG 
9 ACGXGGCGAXXGGXACXGAGGGGACXACGACXCGAGXAXCXCCGGXXGGAGGAXACCXCC 

GlnGluMatGlyGlyAsnlleXhrArgvalGluSarGluAsnl-ysvalVallleLauAsp 
CAGGAGAXGGGCGGCAACAXCACCAGGGXXGAGXCAGAAAACAAAGXGGXGAXXCXGGAC 
GXCCXCXACCCGCCGXXGXAGXGGXCCCAACXCAGXCXTTTGXTTCACCACXAAGACCXG 

SerPheAspProI^uValAiaGluGluAspGluArgGluIleSerValProAlaGluIle 
XCCXXCGAXCCGCXXGXGGCGGAGGAGGACGAGCGGGAGAXCXCCGXACCCGCAGAAAXC 
AGGAAGCXAGGCGAACACCGCCXCCXCCXGCXCGCCCXCXAGAGGCAXGGGCGXCXTXAG 

LauArgLysSerAxgArgPheAlaGlnAlaLeuProValXrpAlaArgProAspXyrAsn 
CXGCGGAAGXCXCGGAGAXXCGCCCAGGCCCXC5CCCGXXXGGGCGCGGCCGGACXAXAAC 
GACGCCXXCAGAGCCXCXAAGCGGGXCCGGGACGGGCAAACCCGCGCCGGCCXGAXAIXG 

ProProLauValGluXhrXrpLysLysProAapXyrGluProProValValHisGlyCys 
6 541 CCCCCGCXAGXGGAGACGXGGAAAAAGCCCGACXACGAACCACCXGXGGXCCAXGGCXGX 
GGGGGCGAXCACCXCXGCACCXXXXTCGGGCXGAXGCXXGGXGGACACCAGGXACCGACA 

ProLeuProProProLysSerProProValProProProArgLyaLysArgXhrValVal 
6601 CCGCXXCCACCXCCAAAGXCCCCXCCTGXGCCXCCGCCXCGGAAGAAGCGGACGGXGGXC 
GGCGAAGGXGGAGGXXTCAGGGGAGGACACGGAGGCGGAGCCXTCXXCGCCXGCCACCAG 

LauXhrGluSerXhrLeuSarXhrAlaLauAlaGluLauAlaXhrArgSerPheGlySer . 
6661 CXCACXGAAXCAACCCXAXCXACXGCCXXGGCCGAGCXCGCCACCAGAAGCXTTGGCAGC 
GAGXGACTXAGXXGGGAXAGAXGACGGAACCGGCXCGAGCGGXGGXCXXCGAAACCGXCG 

SarSarXhrSerGlylleXhrGlyAspAsnXhrXhrXJirSerSerGluProAlaProSar 
6721 XCCXCAACXXCCGGCAXXACGGGCGACAAXACGACAACAXCCXCXGAGCCCGCCCCXTCT 
AGGAGXXGAAGGCCGXAAXGCCCGCXGXTAXGCXGXXGXAGGAGACXCGGGCGGGGAAGA 
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6781 g^Sggjg^Jj^ 


6961 ScccceSS^ 

ATGAGTGGTCCTGTCGCCOUCTTAAMAGCACGTTCS^TCCA^CT^GGGT 

7621 

TGCCTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTC 

1 r$£S TCACCGAGAG GCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGC 
AGGGAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCrCTCTTGACG 

G iX T y rAr< ?ArgCysArgAlaSer<31y.ValI.auThrThrSerC7sGlyAsnThrL«uThr 
'741 GGCTATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACT 
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P 



CCGAXAGCGXCCACGGCGCGCXCGCCGCAXGACXGXXGAXCGACACCAXXGXGGGAGXGA 

CysTyrlleLysAIaAxgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeu 
.7 301 TGCTACAXCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTC 
ACGAXGXAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAG 

ValCysGlyAspAspL«uValVallleCysGluSerAlaGlyValGlnGluA«pAlaAla 
7861 GTGTGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCG 
CACACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGC 

SerL«uArgAlaPheThrGluAlaM*tThrArgTyrS«rAlaProProGlyAspProPro 
7921 AGCCTG AGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCA 
TCGGACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGT 

GlnPro<31uTyrAspL«uGluL«uIleThrSerCysS«rS«rAsnValS«rValAlaHis 
7981 CAACCAG AATACG ACTTGG AGCTCATAACATCATCCTCCTCCAACGTGTCAGTCGCCCAC 
GXXGGXCXXAXGCXGAACCXCGAGXAXXGXAGXACGAGGAGGXXGCACAGXGAGCGGGXG 

AspGlyAlaGlyLysArgValXyrTyrLeuXhrArgAspProXhrThrProLeuAlaArg 
3041 G ACGGCGCXGGAAAGAGGGXCXACXACCXCACCCGXG ACCCXACAACCCCCCXCGCGAGA 
CXGCCGCGACCXXXCXCCCAGAXGAXGGAGXGGGCACXGGGAXGXXGGGGGGAGCGCXCX 

AlaAlaXrpGluXhrAlaArgHisXhrProYalAsnS«rXrpL«uGlyAjnlleIleMet 
3101 GCXGCGXGGGAGACAGCAAGACACACXCCAGXCAAXXCCXGGCXAGGCAACAXAAXCAXG 
CGACGCACCCXCXGXCGXXCXGXGXGAGGXCAGXXAAGGACCGAXCCGXXGXAXXAGXAC 

PheAl*aProXhrL«uXrpAlaArgMetIleLeuMetXhrHi3PhePheSerValLeuIle 
3161 XXXGCCCCCACACTGTGGGCGAGGAXGAXACXGAXGACCCAXXXCXXXAGCGXCCXXAXA 
AAACGGGGGXGXGACACCCGCXCCXACXAXGACXACXGGGXAAAGAAAXCGCAGGAAXAX 

AlaArgAjpGlnL«uGluGlaAlaLeuAspCysGluIleXyrGlyAlaCysXyrSerIle 
3 221 GCCAGGGACCAGCXXGAACAGGCCCXCGATXGCGAGAXCXACGGGGCCXGCXACXCCAXA 
CGGXCCCTGGXCGAACXXGXCCGGGAGCXAACGCXCXAGAXGCCCCGGACGAXGAGGXAX 

GluProLeuA*pI*uProProIleIleGlnAxgLeu ^ 
3281 G AACCACXXGAXCXACCXCCAAXCAXXCAAAG ACXC 
CXXGGXGAACXAGAXGGAGGXXAGXAAGXXXCXGAG 
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-319 CAC TCCkC CATG AAXCACTC C C C TGTGAGG AACTAC TG TC T XCACG CAG AAAG CG TCTAG 
GXGAGGXGGXACXTAGXGAGGGGACACXCCXXGAXGACAGAAGXGCGXCXXXCGCAGAXC 

-259 CCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCATA
GGTACCGCAATCATACTCACAGCACGTCGGAGGTCCTGGGGGGGAGGGCCCTCTCGGTAT

-199 GTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGA 
CACCAGACGCCTTGGCCACTCATGTGGCCTTAACGGTCCTGCTGGCCCAGGAAAGAACCT

-139 TCAACCCGCTCAATGCCTGGAGATTTGGGCGTGCCCCCGCAAGACTGCTAGCCGAGTAGT
AGTTGGGCGAGTTACGGACCTCTAAACCCGCACGGGGGCGTTCTGACGATCGGCTCATCA

-79 GTTGGGTCGCGAAAGGCCTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAG
CAACCCAGCGCTTTCCGGAACACCATGACGGACTATCCCACGAACGCTCACGGGGCCCTC

-19 GTCTCGTAGACCGTGCACC
CAGAGCATCTGGCACGTGG

Arg Thr

MetSerThrAsnProLysProGlnLysLysAsnLysArgAsnThrAsnArgArgProGln
1 ATGAGCACGAATCCTAAACCTCAAAAAAAAAACAAACGTAACACCAACCGTCGCCCACAG
TACTCGTGCTTAGGATTTGGAGTTTTTTTTTTGTTTGCATTGTGGTTGGCAGCGGGTGTC

AspValLysPheProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArg
61 GACGTCAAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGG
CTGCAGTTCAAGGGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCC

GlyProArgLeuGlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGly
121 GGCCCTAGATTGGGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGT
CCGGGATCTAACCCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCA

ArgArgGlnProIleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGly
181 AGACGTCAGCCTATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGG
TCTGCAGTCGGATAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCC

TyrProTrpProLeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerPro
241 TACCCTTGGCCCCTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCC
ATGGGAACCGGGGAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGG

ArgGlySerArgProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGly
301 CGTGGCTCTCGGCCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGT
GCACCGAGAGCCGGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCA

LysValIleAspThrLeuThrCysGlyPheAlaAspLeuMetGlyTyrIleProLeuVal
361 AAGGTCATCGATACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTC
TTCCAGTAGCTATGGGAATGCACGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAG

GlyAlaProLeuGlyGlyAlaAlaArgAlaLeuAlaHisGlyValArgValLeuGluAsp 
421 GGCGCCCCTCTTGGAGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGAC
CCGCGGGGAGAACCTCCGCGACGGTCCCGGGACCGCGTACCGCAGGCCCAAGACCTTCTG

Thr

GlyValAsnTyrAlaThrGlyAsnLeuProGlyCysSerPheSerIlePheLeuLeuAla

481 GGCGTGAACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTCTCTATCTTCCTTCTGGCC
CCGCACTTGATACGTTGTCCCTTGGAAGGACCAACGAGAAAGAGATAGAAGGAAGACCGG

LeuLeuSerCysLeuThrValProAlaSerAlaTyrGlnValArgAsnSerThrGlyLeu
541 CTGCTCTCTTGCTTGACTGTGCCCGCTTCGGCCTACCAAGTGCGCAACTCCACGGGGCTT
GACGAGAGAACGAACTGACACGGGCGAAGCCGGATGGTTCACGCGTTGAGGTGCCCCGAA

TyrHisValThrAsnAspCysProAsnSerSerIleValTyrGluAlaAlaAspAlaIle
601 TACCACGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGCGGCCGATGCCATC
ATGGTGCAGTGGTTACTAACGGGATTGAGCTCATAACACATGCTCCGCCGGCTACGGTAG

LeuHisThrProGlyCysValProCysValArgGluGlyAsnAlaSerArgCysTrpVal
661 CTGCACACTCCGGGGTGCGTCCCTTGCGTTCGTGAGGGCAACGCCTCGAGGTGTTGGGTG
GACGTGTGAGGCCCCACGCAGGGAACGCAAGCACTCCCGTTGCGGAGCTCCACAACCCAC



Fig. 17-1






AlaMetThrProThrValAlaThrArgAspGlyLysLeuProAlaThrGlnLeuArgArg
721 GCGATGACCCCTACGGTGGCCACCAGGGATGGCAAACTCCCCGCGACGCAGCTTCGACGT
CGCTACTGGGGATGCCACCGGTGGTCCCTACCGTTTGAGGGGCGCTGCGTCGAAGCTGCA

HisIleAspLeuLeuValGlySerAlaThrLeuCysSerAlaLeuTyrValGlyAspLeu
781 CACATCGATCTGCTTGTCGGGAGCGCCACCCTCTGTTCGGCCCTCTACGTGGGGGACCTA
GTGTAGCTAGACGAACAGCCCTCGCGGTGGGAGACAAGCCGGGAGATGCACCCCCTGGAT

CysGlySerValPheLeuValGlyGlnLeuPheThrPheSerProArgArgHisTrpThr
841 TGCGGGTCTGTCTTTCTTGTCGGCCAACTGTTCACCTTCTCTCCCAGGCGCCACTGGACG
ACGCCCAGACAGAAAGAACAGCCGGTTGACAAGTGGAAGAGAGGGTCCGCGGTGACCTGC

ThrGlnGlyCysAsnCysSerIleTyrProGlyHisIleThrGlyHisArgMetAlaTrp
901 ACGCAAGGTTGCAATTGCTCTATCTATCCCGGCCATATAACGGGTCACCGCATGGCATGG
TGCGTTCCAACGTTAACGAGATAGATAGGGCCGGTATATTGCCCAGTGGCGTACCGTACC

Val 

AspMetMetMetAsnTrpSerProThrThrAlaLeuValMetAlaGlnLeuLeuArgIle
961 GATATGATGATGAACTGGTCCCCTACGACGGCGTTGGTAATGGCTCAGCTGCTCCGGATC
CTATACTACTACTTGACCAGGGGATGCTGCCGCAACCATTACCGAGTCGACGAGGCCTAG

ProGlnAlaIleLeuAspMetIleAlaGlyAlaHisTrpGlyValLeuAlaGlyIleAla
1021 CCACAAGCCATCTTGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCATAGCG
GGTGTTCGGTAGAACCTGTACTAGCGACCACGAGTGACCCCTCAGGACCGCCCGTATCGC

1081 



ATAAAGAGGTACCACCCCTTGACCCGCTTCCAGGACCATCACGACGACGATAAACGGCCG 

ValAspAlaGluThrHisValThrGlyGlySerAlaGlyHisThrValSerGlyPheVal 
1141 GTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCACACTGTGTCTGGATTTGTT 
CAGCTGCGCGTTTGGGTGCAGTGGCCCCCTTCACGGCCGGTGTGACACAGACCTAAACAA 

SerLeuLeuAlaProGlyAlaLysGlnAsaValGlnLeuIleAsnThrAsnGlySerTrp 
1201 AGCCTCCTCGCACCAGGCGCCAAGCAGAACGTCCAGCTGATCAACACCAACGGCAGTTGG 
TCGGAGGAGCGTGGTCCGCGGTTCGTCTTGCAGGTCGACTAGTTGTGGTTGCCGICAACC 

HisLeuAsnSerThrAlaLeiiAsnCysAsnAapSerLeuAsnThrGlyTrpLeuAlaGly 
1261 CACCTCAATAGGACGGCCCTGAACTGCAATGATAGCCTCAACACCGGCTGGTTGGCAGGG 
GTGGAGTTATCGTGCCGGGACTTGACGTTACTATCGGAGTTGTGGCCGACCAACCGTCCC 

LeuPheTyrHisHlsLysPheAsnSerSerGlyCyaProGliiAxgreuAlaSerCysArg 
1321 OTTTTCTATCACCACAAGTTCAACTCTTCAGGCTGTCCTGAGAGGCTAGCCAGCTGCCGA 
GAAAAGATAGTGGTGTTCAAGTTGAGAAGTCCGACAGGACTCTCCGATCGGTCGACGGGT 

ProLeuThrAspPheAspGlnGlyTrpGlyProlleSerTyrAlaAsnGlySerGlyPro 
1381 CCCCTTACCGATTTTGACCAGGGCfGGGGCCCTATCAGXTATGCCAACGGAAGCGGCCCC 
GGGGAATGGCTAAAACTGGTCCCGACCCCGGGATAGTCAATACGGTTGCCTTCGCCGGGG 

AspGlnArgProTyrCysTrpHisTyrProP^^^ 
1441 GAC^GCGCCCCTACTGCTGGCACTACCCCCCAAAACCTTGCGGTATTGTGCCCGCG^G 
CTGGTCGCGGGGATGACGACCGTGATGGGGGGMTTGGAACGCCATAACACGGGCGCTTC 

SerValCysGlyProValTyrCysPheThrProSerProValValValGlyThrThrAsp 
1501 AGTGTGTGTGGTCCGGTATAXTGCTTCACTCCCAGCCCCGTGGTGGTGGGAACGACCGAC 

T^^a^CCAGGCCATATAACG^ 

ArcrSexGlvAlaProThrTyrSerTrpGlyGluAsnAspThrAspValPheValLeuAsn 
1561 AGGTCGGGCGCGCCCACCTACAGCTGGGGTGAAAATGATACGGACGTCTTCGTCCTTA^ 
TC^GCCCGCGCGGGTGGATGTCGACCCCACTTTTACTATGCCTGCAGAAGCAGGAAXTG 



1621 



TTA 
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ThrLysValCysGlyAlaProProCysVallleGlyGlyAlaGlyAsnAsaThrLeuHis 
1681 ACCAAAGTGTGCGGAGCGCCTCCTTGTCTCATC^ 

TGGTTTCACACGCCTCGCGGAGGAAGACAGTAGCCTCCCCGCCCGTTGTTC 

CysProThrAfipCysPhe^LysHisProAspAlaThrTyrSerArgCysGl^ 
1741 TGCCCCACTGATTGCTTCCGCAAGCATCCGGACGCCACATACTCTCGGTGCGGCTCCGGT 
ACGGGGTGACTAACGAAGGCGTTCGTAGGCCTGCGGTGTATGAGAGCCACGCCGAGGCCA 

Leu 

ProTrpIleThrProArgCyaLeuValAspTyrProTyrArgLeuTrpHisTyrProCys 
1801 CCCTGG ATCACACCCAGGTGCCTGGTCGACTACCCGTATAGGCTTTGGCATTATCCTTGT 
GGGACCTAGTGTGGGTCCACGGACCAGCTGATGGGCATATCCGAAACCGTAATAGGAACA 

ThrlleAsnTyrThrllePheLyslleArgMetTyrValGlyGlyValGluHisArgLeu 
1861 ACCATCAACTACACCATATTTAAAATCAGGATGTACGTGGGAGGGGTCGAACACAGGCTG 
TGGTAGTTGATGTGGTAIAAATTTTAGTCCTACATC 

GluAlaAlaCysAsnTrpThrArgGlyGluArgCysAspLeuGluAspArgAapArgSer 
1921 GAAGCTGCCTGCAACTGGACGCGGGGCGAACGTTGCGATCTGGAAGACAGGGACAGGTCC 
CTTCGACGGACGTTGACCTGCGCCCCGCTTGCAACGCTAGACCTTCTGTCCCTGTCCAGG 

GluI^uSerProI^uLeuI^uThrThrT^^ 
1981 GAGCTCAGCCCGTTACTGCTGACCACTACACAGTGGCAGGTCCTCCCGTGTTCCTTCACA 
CTCGAGTCGGGCAATGACGACTGGTGATGTGTC^CCGTCCAGGAGGGCACAAGGAAGTGT 

Thrl^uProAlaLeuSerThrGlyl^ulleHlsI^iiHisGlnAsnlleValAspValGln 
2041 ACCCTACCAGCCTTGTCCACCGGCCTCATCCACCTCCACCAGAACATTGTGGACGTGCAG * 
TGGGATGGTCGGAACAGGTGGCCGGAGTAGGTGGAGGTGGTCTTGTAAGACCTGCAC 

TyrLeuTyrGlyvalGlySerSerllaAlaSeorTrpAlalleLysTrpGluTyrValVal 
2101 TACTTGTACGGGGTGGGGTCAAGCATCGCGTCCTGGGCCATTAAGTGGGAGTACGTCGTT p 
ATGAACATGCCCCACCCCAGTTCGTAGCGCA^ 

I^uLeuPheLeuL^uLeuAlaAspAlaArgValCysSer^ 
2161 CTCCTGTTCCTTCTGCTTGCAGACGCGCGCGTCTGCTCCTGCTTGTGGATGATGCTACTC 
GAGGACAAGGAAGACGAACGTCTGCGCGCGCAGACGAGGACGAACACCTACTACGATGAG 

IleSerGlnAlaGluAlaAlaLeuGliiAsnLeuVallleLeuAsnAlaAlaSerLeuAla 
2221 ATATCCCAAGCGGAGGCGGCTTTGGAGAACCTCGTAATACTTAATGCAGCATCCCTGGCC 
XATAGGGTTCGCCTCCGCCGAAACCTCTTGGAGCATTATGAATTACGTCGTAGGGACCGG 

GlyThrHisGlyl^uValSerPheX^uValPhePhe^ 
2281 GGGACGCACGGTCTTGTATCCTTCCTCGTGTTCTTCTGCTTTGCATGGTATTTGAAGGGT 
CCCTGCGTGCCAGAACATAGGAAGGAGCACAAGAAGACGAAACGTACCATAAACTTCCCA 

LysTrpValProGlyAlaValTyrThrPheTyr^^ 
2341 AAGTGGGTGCCCGGAGCGGTCTACACCTTCTACGGGATGTGGCCTCTCCTCCTGCTCCTG 
TTCACCCACGGGCCTCGCCAGATGTGGAAGAT^ 

LeuAlal^uProGlnArgAlaTyrAlal^u^ 
2401 TTGGCGTTGCCCCAGCGGGCGTACGCGCTGG ACACGG AGGTGGCCGCGTCGTGTGGCGGT 
AACCGCAACGGGGTCGCCCGGATGCGCGACCTGTGCCTCCACCGGCGCAGCACACCGCCA 

ValVall^uValGlyl^uMei^aLeuThrLeuSerProTyrTyrLysArgTyxIleSer 
2461 GTTGTTCTCGTCGGGTTGATGGCGCTGACTCTGTCACCATATTACAAGCGCTATATCAGC 
C^CAAGAGCAGCCCAACTACCGCGACTGAGACAGTGGTATAATGTTCGCGATATAGTCG 

Asn 

TrpCysLeuTrpTrpLeuGlnTyrPheLeuThrArgValGluAlaGlnl^uHlsValTrp 
2521 TGGTGCTTGTGGTGGCTTCAGTATTTTCTGACCAGAGTGGAAGCGCAACTGCACGTGTGG 
ACCACGAACACCACCGAAGTCATAAAAGACTGGTCTCACCTTCGCGTTGACGTGCACACC 

IleProProLeuAsnValAxgGlyGlyArgAspAlaVallleLeuLeuMetCyeAlaVal 
2581 ATTCCCCCCCTCAACGTCCGAGGGGGGCGCGACGCCGTCATCTTACTCATGTGTGCTGTA 
TAAGGGGGGGAGTTGCAGGCTCCCCCCGCGCTGCGGCAGTAGAATGAGTACACACGACAT 
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HifiProThrl^uValPheAspileThrLyeLeuLeuLexiAlaValPheGlyProLeuTrp 
2641 CACCCGACTCTGGTATTTGACATCACCAAATTGCTGCTGGCCGTCTTCGGACCCCTTTGG 
GTGGGCTGAGACCATAAACTGTAGTGGTTTAACGACGACCGGCAGAAGCCTGGGGAAACC 

Ilel^uGlnAlaSerlAul^uLysValProTyrPheValArgValGlnGlyLeuLeuArg 
2701 ATTCTTCAAGCCAGTTTGCTTAAAGTACCCTACTTTGTGCGCGTCGAAGGCCTTCTCCGG 
TAAGAAGTTCGGTCAAACGAATTTCATGGGATGAAACACGCGCAGGTTCCGGAAGAGGCC 

PheCysAlaLeuAlaArgLysMetlleGlyGlyHisTyrValGlnMetValllelleLyg 
2761 TTCTGCGCGTTAGCGCGGAAGAZGATCGGAGGCCATTACGTGCAAATGGTCATCATTAAG 
AAGACGCGCAATCGCGCCTTCTACTAGCCTCCGGTAATGCACGTTTACCAGTAGTAATTC 

LeuGlyAlaLeuThrGlyThrTyrValTyrAsnHiaLeuThrProLeuArgAspTrpAla 
2821 TTAGGGGCGCTTACTGGCACCTATGTTTATAACCATCTCACTCCTCTTCGGGACTGGGCG 
AATCCCCGCGAATGACCGTGGATACAAATATTGGTAGAGTGAGGAGAAGCCCTGACCCGC 

HisAsnGlyLeuArgAspLeuAlaValAlaValGluProValValPheSerGlnMetGlu 
2881 CACAACGGCTTGCGAGATCTGGCCGTGGCTGTAGAGCCAGTCGTCTTCTCCCAAATGGAG 
GTGTTGCCGAACGCTCTAGACCGGCACCGACATCTCGGTCAGCAGAAGAGGGTTTACCTC 

ThrLysLeuIleThrTrpGlyAlaAspThrAlaAlaCysGlyAapIlelleAsnGlyLeu 
2941 ACCAAGCTCATCACGTGGGGGGCAGATACCGCCGCGTGCGGTGACATCATCAACGGCTTG 
TGGTTCGAGTAGTGCACCCCCCGTCTATGGCGGCGCACGCCACTGTAGTAGTTGCCGAAC 

ProValSerAlaArgArgGlyArgGluIleLeuLeuGlyProAlaAspGlyMetValSer 
3001 CCTGTTTCCGCCCGCAGGGGCCGGGAGATACTGCTCGGGCCAGCCGATGGAATGGTCTCC 
GGACAAAGGCGGGCGTCCCCGGCCCTCTATGACGAGCCCGGTCGGCTACCTTACCAGAGG 

LysGlyTrpArgLeuLeuAlaProIleThrAlaTyrAlaGlnGlnThrArgGlyLei^eu 
3061 AAGGGGTGGAGGTTGCTGGCGCCCATGACGGCGTACGCCCAGCAGACAAGGGGCCTCCTA 
TTCCCCACCTCCAACGACCGCGGGTAGTGCCGCATGCGGGTCGTCTGTTCCCCGGAGGAT 

GlyCysIlelleThrSerLeuThrGlyArgAspLyaAsnGlnValGluGlyGluValGln 
3121 GGGTGCATAATCACCAGCCTAACTGGCCGGGACAAAAACCAAGTGGAGGGTGAGGTCCAG 
CCCACGTATTAGTGGTCGGATTGACCGGCCCTGTTTTTGGTTCACCTCCCACTCCAGGTC 

ileValSerThrAlaAlaGlnThrPheLeuAlaThrCysileAsnGlyValCysTrpThr 
3181 ATTGTGTCAACTGCTGCCCAAACCTTCCTGGCAACGTGCATCAATGGGGTGTGCTGGACT 
TAACACAGTTGACGACGGGTTTGGAAGGACCGTTGCACGTAGTTACCCCACACGACCTGA 

ValTyrHisGlyAlaGlyThrArgThrlleAlaSerProLysGlyProVallleGlnMet 
3241 GTCTACCACGGGGCCGGAACGAGGACCATCGCGTCACCCAAGGGTCCTGTCATCCAGATG 
CAGATGGTGCCCCGGCCTTGCTCCTGGTAGCGCAGTGGGTTCCCAGGACAGTAGGTCTAC 

Ser Thr 

TyrThrAsnValAspGlnAspLeuValGlyTrpProAlaProGlnGlySerArgSerLeu 
3301 TATACCAATGTAGACCAAGACCTTGTGGGCTGGCCCGCTCCGCAAGGTAGCCGCTCATTG 
ATATGGTTACATCTGGTTCTGGAACACCCGACCGGGCGAGGCGTTCCATCGGCGAGTAAC 

ThrProCysThrCysGlySerSerAspLeuTyrLeuValThrArgHisAlaAspVallle 
3361 ACACCCTGCACTTGCGGCTCCTCGGACCTTTACCTGGTCACGAGGCACGCCGATGTCATT 
TGTGGGACGTGAACGCCGAGGAGCCTGGAAATGGACCAGTGCTCCGTGCGGCTACAGTAA 

ProValArgArgArgGlyAspSerArgGlySerLeuLeuSerProArgProlleSerTyr 
3421 CCCGTGCGCCGGCGGGGTGATAGCAGGGGCAGCCTGCTGTCGCCCCGGCCCATTTCCTAC 
GGGCACGCGGCCGCCCCACTATCGTCCCCGTCGGACGACAGCGGGGCCGGGTAAAGGATG 

LeuLysGlySerSerGlyGlyProLexiLeuCysProAlaGlyHisAlaValGlyllePhe 
3481 TTGAAAGGCTCCTCGGGGGGTCCGCTGTTGTGCCCCGCGGGGCACGCCGTGGGCATATTT 
AACTTTCCGAGGAGCCCCCCAGGCGACAACACGGGGCGCCCCGTGCGGCACCCGTATAAA 

ArgAlaAlaValCysThrArgGlyValAlaLysAlaValAspPhelleProValGluAsn 
3541 AGGGCCGCGGTGTGCACCCGTGGAGTGGCTAAGGCGG1GGACTTTATCCCTGTGGAGAAC 
TCCCGGCGCCACACGTGGGCACCTCACCGATTCCGCCACCTGAAATAGGGACACCTCTTG 
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LeuGluThrThjrMetArgSerProValPheThrAsDAsnSerSerProProValValPro 
3601 CTAGAGACAACCATGAGGTCCCCG6TGTTCACGGATAACTCCTCTCCACCAGTAGTGCCC 
GATCTCTGTTGGTACTCCAGGGGCCACAAGTGCCXATTGAGGAGAGGTGGTCATCACGGG 

GlnSerPheGlnValAlaHisLexiHisAlaProThxGlySerGlyLysSerThrLysVal 
3661 CAGAGCTTCCAGGTGGCTCACCTCCATGCTCCCACAGGCAGCGGCAAAAGCACCAAGGTC 
GTCTCGAAGGTCCACCGAGTCGAGGTACGAGGGTGTCCGTCGCCGTTTTCGTGGTTCCAG 

Prc^laAlaTyrAlaAlaGlnGlyTyrLysValLeuVall^o^nProSerValAlaAla 
3721 CCGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAACCCCTCTGTTGCTGCA 
GGCCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTGGGGAGACAACGACGT 

Leu 

ThrLeuGlyPheGlyAlaTyrMetSerLysAlaHiaGlylleAspProAsnlleArgThr 
3781 ACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGATCCTAACATCAGGACC 
TGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTAGGATTGTAGTCCTGG 

GlWalArgThrlleThrThrGlySerProlleThr^^ 
3841 GGGGTG AGAACWTTACCACTGGCAGCCCCATCACGTACTCCACCTACGGCAAG TTCCTT 
CCCCACTCTTCTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGGATGCCGTTCAAGGAA 

AlaAspGlyGlyCysSerGlyGlyAlaTyrAspIlellelleCyaAepGluCysHisSer 
3901 GCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGTGACGAGTGCCACTCC 
CGGCTGCCGCCCACGAGCCCCCCGCGAAIACTGTATTATTAAACACTGCTCACGGTGAGG 

ThrAspAlaThrSerlleLeuGlylleGlyThrValLauAspGlnAlaGluThrAlaGly 
3961 ACGGATGCCACATCCATCTTGGGCATCGGCACTGTCCTTGACCAAGCAGAGACTGCGGGG 
TGCCTACGGTGTAGGTAGAACCCGTAGCCGTGACAGGAACTGGTTCGTCTCTGACGCCCC 

AlaAxgi^uValValLeuAlaThrAlaThrProProGlySerValThrValProHisPro 
4021 GCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTCACTGTGCCCCATCCC 
CGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAGTGACACGGGGTAGGG 

AsnlleGluGluValAlaLeuSarThrThrGlyGluIleProPheTyrGlyLysAlalle 
4081 AACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTTTACGGCAAGGCTATG 
TTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAAATGCCGTTCCGATAG 

ProX^uGluVallleLysGlyGlyArgHisLeuIlePheCysHisSerLysLysLysCys 
4141 CCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCATTCAAAGAAGAAGTGC 
GGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTAAGTTTCTTCTTCACG 

AspGluL^iaAlaAlaLysLeuVaiAlal^uGlylleAsii^aValAlaTyrTyrArgGly 
4201 GACGAACTCGCCGCAAAGCTGG TCGCATTGGGCATCAATGCCGTGGCCTACT AC CGCGGT 
CTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCACCGGATGATGGCGCCA 

LeuAspValSerVallleProThrSerGlyAspValValValValAlaThrAspAlaLeu 
4 261 CTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTGGCAACCGATGCCCTC 
GAACTGCACAGGCAGTAGGGCTGGTCGCCGCSACAACAGCAGCACCGTTGGCTACGGGAG 

Tyr 

MetThrGlyTyrThrGlyAspPheAspSerVallleAspCysAsnThrCysValThrGIn 
4321 ATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAATACGTGTGTCACCCAG 
TACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTATGCACACAGTGGGTC 

Sar 

ThrvaiA3ppneSerLeuAspProTnrPheThrileGluThrileThrLeuProGlnAsp 

4381 ACAGTCG ATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATCACGCTCCCCCAGGAT 
TGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAGTGCGAGGGGGTCCTA 

AlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLysProGlylleTyrArg 
4441 GCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAGCCAGGCATCTACAGA 
CGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTCGGTCCGTAGATGTCT 

oheValAlaProGlyGluArgProSerGlyMetPheAspSerSerValLeuCysGluCys 
4501 TTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACXCGTCCGTCCTCTGTGAGTGC 
AAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGGCAGGAGACACTCACG 



* 
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TyrAspAlaGlyCyaAlaTrpTyrGluLeuThrProAlaGluThrThrValArgLeuArg 
4561 TATGACGCAG0CTGTGCTTG3TATGAGCTCACGCCCGCCGAGACTACAGTTAGGCTACGA 
ATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGATGTCAATCCGATGCT 

AlaTyrMetAflnThrProGlyLeuProValCyaGlnAapHiflLeuGluPheTrpGluGly 
4621 GCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTTGAATTTTGGGAGGGC 
CGCAT^TACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAACTTAAAACCCTCCCG 

ValPheThrGlyLeuThrHialleAspAlaHiaPheLeuSerGlnThrLysGlnSerGly 
4681 GTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAGACAAAGCAGAGTGGG 
CAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTCTGTTTCGTCTCACCC 

GluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAlaArgAlaGlnAlaPro 
4741 GAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCTAGGGCTCAAGCCCCT 
CTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGATCCCGAGTTCGGGGA 

ProProSerTrpAspGlnMetTrpLysCysLeuIleArgLev^ysProThrLeuHisGly 
4801 CCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAGCCCACCCTCCATGGG 
GGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTCGGGTGGGAGGTACCC 

ProThrProLeuLeuTyrArgLeuGlyAlavalGlnAanGluileThrlLeuThrHisPro 
4861 CCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATCACCCTGACGCACCCA 
GGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAGTGGGACTGCGTGGGT 

valThrLysTyrlleMetThrCysMetSerAlaAspLeuGluValValThrSerThrTrp 
4 921 GTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTCGTCACGAGCACCTGG 
CAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAGCAGTGCTCGTGGACC 

ValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeuSerThrGlyCysVal
4981 GTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTGTCAACAGGCTGCGTG 
CACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGACAGTTGTCCGACGCAC 

Fig. 17-6

ValIleValGlyArgValValLeuSerGlyLysProAlaIleIleProAspArgGluVal
5041 GTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATACCTGACAGGGAAGTC 
CAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTATGGACTGTCCCTTCAG 

LeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeuProTyrIleGluGln
5101 CTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTACCGTACATCGAGCAA 
GAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAATGGCATGTAGCTCGTT 

GlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeuLeuGlnThrAlaSer
5161 GGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTCCTGCAGACCGCGTCC
CCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAGGACGTCTGGCGCAGG 

ArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGlnLysLeuGluThrPhe
5221 CGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAAAAACTCGAGACCTTC 
GCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTTTTTGAGCTCTGGAAG 

TrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeuAlaGlyLeuSerThr
5281 TGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTGGCGGGCTTGTCAACG 
ACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAACCGCCCGAACAGTTGC 

LeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAlaAlaValThrSerPro
5341 CTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCTGCTGTCACCAGCCCA 
GACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGACGACAGTGGTCGGGT 

LeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrpValAlaAlaGlnLeu
5401 CTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGGGTGGCTGCCCAGCTC
GATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACCCACCGACGGGTCGAG

AlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAlaGlyAlaAlaIleGly
5461 GCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCTGGCGCCGCCATCGGC
CGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGACCGCGGCGGTAGCCG

SerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyrGlyAlaGlyValAla
5521 AGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTATGGCGCGGGCGTGGCG
TCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATACCGCGCCCGCACCGC

Gly 

GlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSerThrGluAspLeuVal
5581 GGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCCACGGAGGACCTGGTC
CCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGGTGCCTCCTGGACCAG 

AsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGlyValValCysAlaAla
5641 AATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGCGTGGTCTGTGCAGCA 
TTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCGCACCAGACACGTCGT 

IleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrpMetAsnArgLeuIle
5701 ATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGGATGAACCGGCTGATA 
TATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACCTACTTGGCCGACTAT 

AlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrValProGluSerAspAla 
5761 GCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTGCCGGAGAGCGATGCA
CGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCACGGCCTCTCGCTACGT 

HisCys 

AlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGlnLeuLeuArgArgLeu
5821 GCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAGCTCCTGAGGCGACTG 
CGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTCGAGGACTCCGCTGAC 

HisGlnTrpIleSerSerGluCysThrThrProCysSerGlySerTrpLeuArgAspIle 
5881 CACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCCTGGCTAAGGGACATC 
GTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGGACCGATTCCCTGTAG 

TrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeuLysAlaLysLeuMet
5941 TGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTAAAAGCTAAGCTCATG
ACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGATTTTCGATTCGAGTAC

ProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyrLysGlyValTrpArg
6001 CCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTATAAGGGGGTCTGGCGA 
GGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATATTCCCCCAGACCGCT 

Gly 

ValAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIleThrGlyHisValLys
6061 GTGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATCACTGGACATGTCAAA
CACCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAGTGACCTGTACAGTTT

AsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMetTrpSerGlyThrPhe
6121 AACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATGTGGAGTGGGACCTTC 
TTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTACACCTCACCCTGGAAG 

ProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAlaProAsnTyrThrPhe
6181 CCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCGCCGAACTACACGTTC 
GGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGCGGCTTGATGTGCAAG 

AlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGlnValGlyAspPheHis 
6241 GCGCTATGGAGGGTGTCTGCAGAGGAATATGTGGAGATAAGGCAGGTGGGGGACTTCCAC
CGCGATACCTCCCACAGACGTCTCCTTATACACCTCTATTCCGTCCACCCCCTGAAGGTG 

TyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGlnValProSerProGlu
6301 TACGTGACGGGTATGACTACTGACAATCTCAAATGCCCGTGCCAGGTCCCATCGCCCGAA 
ATGCACTGCCCATACTGATGACTGTTAGAGTTTACGGGCACGGTCCAGGGTAGCGGGCTT 

PhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaProProCysLysProLeu 
6361 TTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCCCCCTGCAAGCCCTTG 
AAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGGGGGACGTTCGGGAAC

LeuArgGluGluValSerPheArgValGlyLeuHisGluTyrProValGlySerGlnLeu 
6421 CTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCGGTAGGGTCGCAATTA 
GACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGCCATCCCAGCGTTAAT
6481 i^^^^ScK^siii^CKCMSTACSlSTGACIAaMiGaGIA
TATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGGGGGAC3ACACCGGTCG 



6541 ATAACAGCA



6661 



6721 



6781 



AGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACGTGGCGATTGGTACTG 
AG^GACracaSSsTiTCICCSGIIGSAmAIACCTCCGTCCTCTMOCGCCGTTS 
TAOTGGTCCCA^TCAGTGTTTTGTTTCACCACTAAGACCTGAGGAAGCTAGGCGAACAC 

AG^GGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAGIGACTTAGTTGGGAT 

702 i SggSSSSSSS^^ 

.081 S?BS~^^ 




7201 jgggggss^ 



7321 



7381 



mBBBBaBBSBBE 



^gISaaIg^g^^SSgiaaacigtctgacgttcaagac 
AspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAlaSerLysValLysAla
7441 GACAGCGATTACCAGGACGTACTC^ 

CTGTCGGTAATGGTCCTCCATCAGTTCCTC 

Phe 

AsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHisSerAlaLysSerLys
7501 AACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACACTCAGCCAAATCCAAG
TTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTGAGTCGGTTTAGGTTC

PheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAlaValThrHisIleAsn
7561 TTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCCGTAACCCACATCAAC 
AAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGGCATTGGGTGTAGTTG 

SerValTrpLysAspLeuLeuGluAspAsnValThrProIleAspThrThrIleMetAla
7621 TCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGACACTACCATCATGGCT 
AGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTGTGATGGTAGTACCGA 

LysAsnGluValPheCysValGlnProGluLysGlyGlyArgLysProAlaArgLeuIle
7681 AAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAGCCAGCTCGTCTCATC 
TTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTCGGTCGAGCAGAGTAG 

ValPheProAspLeuGlyValArgValCysGluLysMetAlaLeuTyrAspValValThr
7741 GTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTGTACGACGTGGTTACA
CACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAACATGCTGCACCAATGT

LysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyrSerProGlyGlnArg
7801 AAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATACTCACCAGGACAGCGG
TTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATGAGTGGTCCTGTCGCC 

ValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMetGlyPheSerTyrAsp
7861 GTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATGGGGTTCTCGTATGAT 
CAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTACCCCAAGAGCATACTA 

ThrArgCysPheAspSerThrValThrGluSerAspIleArgThrGluGluAlaIleTyr
7921 ACCCGCTGCTTTGACTCCACAGTGACTGAGAGCGACATCCGTACGGAGGAGGCAATCTAC 
TGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGCCTCCTCCGTTAGATG 

GlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSerLeuThrGluArgLeu
7981 CAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCCCTCACCGAGAGGCTT 
GTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGGGAGTGGCTCTCCGAA 

Gly 

TyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGlyTyrArgArgCysArg 
8041 TATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGCTATCGCAGGTGCCGC
ATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCGATAGCGTCCACGGCG 

AlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCysTyrIleLysAlaArg
8101 GCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGCTACATCAAGGCCCGG 
CGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACGATGTAGTTCCGGGCC 

AlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuValCysGlyAspAspLeu
8161 GCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTGTGTGGCGACGACTTA
CGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCACACACCGCTGCTGAAT

ValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSerLeuArgAlaPheThr
8221 GTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGCCTGAGAGCCTTCACG 
CAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCGGACTCTCGGAAGTGC 

GluAlaMetThrArgTyrSerAlaProProGlyAspProProGlnProGluTyrAspLeu 
8281 GAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAACCAGAATACGACTTG 
CTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTTGGTCTTATGCTGAAC 

GluLeuIleThrSerCysSerSerAsnValSerValAlaHisAspGlyAlaGlyLysArg
8341 GAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGACGGCGCTGGAAAGAGG
CTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTGCCGCGACCTTTCTCC



ValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAlaAlaTrpGluThrAla
8401 GTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCTGCGTGGGAGACAGCA 
CAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGACGCACCCTCTGTCGT 

ArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPheAlaProThrLeuTrp
8461 AGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTTGCCCCCACACTGTGG
TCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAACGGGGGTGTGACACC 

AlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAlaArgAspGlnLeuGlu
8521 GCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCCAGGGACCAGCTTGAA
CGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGGTCCCTGGTCGAACTT

GlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGluProLeuAspLeuPro
8581 CAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAACCACTTGATCTACCT
GTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTTGGTGAACTAGATGGA

I 

ProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHisSerTyrSerProGly
8641 CCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCACAGTTACTCTCCAGGT
GGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTGTCAATGAGAGGTCCA 

GluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValProProLeuArgAlaTrp
8701 GAAATTAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCGCCCTTGCGAGCTTGG
CTTTAATTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGCGGGAACGCTCGAACC

Gly 

ArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGlyGlyArgAlaAlaIle
8761 AGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGAGGCAGGGCTGCCATA
TCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCTCCGTCCCGACGGTAT

CysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
8821 TGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAAC
ACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTTG

Fig. 17-10